1. INTRODUCTION

A phonological rules system has been implemented as a
language extension of SDC INFIX LISP [1]. The system can be
used in two modes: (1) as an interactive rule tester, and
(2) as a library of functions with other LISP programs. The
key language capabilities are:

•definitions of phonological rules,
•definitions of ordered rule application,
•definitions of unordered rule application,
•definitions of nondeterministic rule application,
•lexicon definitions,
•sub-lexicon definitions, and
•multiple forms for words in the lexicon.

The  key  system  capabilities  and  features  are: 

•ability to edit and recompile all definitions,
•ability to output symbolics to a terminal, printer,
or disk file in a format that allows recompilation,
•ability to selectively test a rule or group of rules
against a single form, a lexicon entry, a sub-lexicon,
or the entire lexicon, and
•all phonological rule definitions and all rule
applying subr definitions are compiled rather
than interpreted.

This  document  describes  the  phones  and  their  features,  the 
individual  commands,  the  available  editor,  and  the  system. 
Appendix 1 gives a formal BNF definition of the commands.


2. PHONES AND THEIR FEATURES

Appendix  2 summarizes  the  available  phonetic  symbols  and 
their  features.  The  symbols  used  are  from  the  ARPAbet*. 
Each  phone  has  a KIND.  The  possible  values  of  KIND  are 
VOWEL,  BOUNDARY,  and  CONST  (consonant). 


2.1  VOWEL  FEATURES 

All vowels have the feature VOICE and a stress level. There
are three levels of stress: 0, 1, and 2. Stress level 0
means reduced, stress level 1 means unstressed, and stress
level 2 means stressed. Schwa (AX) is always assumed to be
reduced. Other vowels may have their level of stress
specified by following the vowel name with a colon and one
of the integers 0, 1, or 2. For example, a reduced IH is
written IH:0, an unstressed OW is written OW:1, and a
stressed ER is written ER:2.

* The ARPAbet is a phonetic representation agreed upon by a
group of ARPA contractors for transmitting phonetic strings
to and from a computer.


2.2  BOUNDARY  FEATURES 

There are two boundary phones: *, which is a syllable
boundary, and #, which is a word boundary. Neither is
marked with any features other than BOUNDARY.


2.3  CONSONANT  FEATURES 

All other phones are consonants and therefore have the
feature CONST. Each consonant also has a CLASS feature, may
be marked VOICE or not, and may have a place of articulation
specified. The possible values of CLASS are NASAL, PLOS,
FRIC, AFRIC, GLIDE, LATERAL, CENTRAL, and MISC. The
respective meanings are nasal, plosive, fricative,
affricate, glide, L, R, and miscellaneous. The presence
of the feature VOICE means that the phone is voiced. All
the consonants except HH and the glottal stop (Q) have a
PLACE of articulation feature. The values of PLACE are
LABIAL, DENTAL, ALVEOLAR, ALVPAL (alveolar-palatal), VELAR,
and PALATAL.


2.4 PHONOLOGICAL SPELLINGS OF LEXICAL FORMS

Several system commands input an argument form spelled in
ARPAbet. The format of the spelling is a sequence of phone
names enclosed in parentheses. For example, a phonological
spelling of HAVING is

(HH AE:2 V*IH:0 NX)

and a spelling of BIG HOUSES is

(B IH G#HH AH:2 Z*AX Z)

Several points should be noted about the input format.
First, it is not necessary to enter the left and right word
boundaries (#) because the system automatically adds them.
Second, if a vowel other than schwa is entered without an
explicit stress level (:0, :1, or :2 following the vowel),
then the stress level is assumed to be 1. Third, all
contiguous pairs of symbols in a phonological spelling must
be separated from each other by one or more blanks unless at
least one of them is "(", ")", "#", or "*". In the
latter case, though not required, blanks are permissible.
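
The input conventions above can be sketched in Python. This is
only an illustration, not the system's LISP implementation: the
function name parse_spelling, the tuple representation, and the
vowel inventory are assumptions (the authoritative phone list is
in Appendix 2).

```python
import re

# Assumed vowel inventory for illustration; Appendix 2 holds the real list.
VOWELS = {"IY", "IH", "EY", "EH", "AE", "AA", "AH", "AO", "OW", "UH",
          "UW", "ER", "AX", "AW", "AY", "OY"}

# A token is a phone name with an optional ":stress", or a boundary symbol.
TOKEN = re.compile(r"[A-Z]+(?::[0-2])?|[#*]")


def parse_spelling(text):
    """Return a list of (name, stress) pairs; stress is None for non-vowels."""
    body = text.strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError("a spelling must be enclosed in parentheses")
    inner = body[1:-1]
    # Anything left after removing tokens must be blanks.
    if TOKEN.sub("", inner).strip():
        raise ValueError("unrecognized characters in spelling")
    phones = []
    for token in TOKEN.findall(inner):
        name, _, level = token.partition(":")
        if name not in VOWELS:
            stress = None
        elif name == "AX":
            stress = 0                           # schwa is always reduced
        else:
            stress = int(level) if level else 1  # default stress level is 1
        phones.append((name, stress))
    # The exterior word boundaries are added automatically.
    if not phones or phones[0][0] != "#":
        phones.insert(0, ("#", None))
    if len(phones) < 2 or phones[-1][0] != "#":
        phones.append(("#", None))
    return phones
```

For instance, parse_spelling("(K AH P)") yields the vowel AH with
the default stress level 1 and a word boundary at each end.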


3. RULE DEFINITION

This section describes the command that causes a
phonological rule to be compiled into the system. A formal
syntax description of <rule> is given in Appendix 1. The
rule language is also described by Barnett in [2].


A rule definition is introduced by $ followed by a blank and
the rule name. The rule name is an identifier: a sequence
of letters, digits, and periods, the first of which is not a
digit. The rest of the rule consists of a left side, an
equal sign (=), a right side, and an optional conditional
phrase. The right side of a rule specifies a pattern
(schemata) of phone sequences. If the right side matches
(properly describes) an input phonetic sequence, the
transformation described by the left side is performed on
the input sequence. The conditional, if present, describes
additional criteria that must be met for the transformation
to be made. For example, a simple rule for changing a T to
a flap (DX) is


$ FLAP DX=VOWEL/T/VOWEL IF STRESS@1 GR STRESS@3;

The rule name is FLAP. The right side is VOWEL/T/VOWEL and
describes a sequence of any vowel followed by T followed by
any vowel. The left side is DX, and the associated
transformation is to substitute DX for the sequence between
the / pair, in this case for T. The conditional is IF
STRESS@1 GR STRESS@3 and it means that the transformation
should be done only if the stress level of the first phone
(first vowel) in the matched sequence is greater than the
stress level of the third phone (second vowel) in the
matched sequence. For example, the input sequence

IH:2 T IY:1

would be transformed by FLAP to

IH:2 DX IY:1

However, the following sequences would not be transformed:

IH:2 R T IY:1
IH:1 T IY:2


The first sequence is not matched by the right side because
of the presence of R. The second sequence is matched by the
right side of FLAP but fails the conditional test because
the stress level of the first vowel is not greater than the
stress level of the second vowel.
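
The behavior of FLAP on these three sequences can be mimicked
with a short Python sketch. It hard-codes this one rule rather
than compiling the rule language; the name apply_flap, the
(name, stress) pair representation, and the vowel set are
assumptions made for the illustration.

```python
# Assumed vowel inventory; the real list is given in Appendix 2.
VOWELS = {"IY", "IH", "EY", "EH", "AE", "AA", "AH", "AO", "OW", "UH",
          "UW", "ER", "AX", "AW", "AY", "OY"}


def apply_flap(phones):
    """Apply DX=VOWEL/T/VOWEL IF STRESS@1 GR STRESS@3 to a list of
    (name, stress) pairs and return the transformed list."""
    out = list(phones)
    for i in range(1, len(out) - 1):
        (left, s1), (mid, _), (right, s3) = out[i - 1], out[i], out[i + 1]
        # Right side: vowel, T, vowel.  Conditional: STRESS@1 GR STRESS@3.
        if left in VOWELS and mid == "T" and right in VOWELS and s1 > s3:
            out[i] = ("DX", None)   # left side: substitute DX for the nucleus
    return out
```

Only the first of the three example sequences is changed; the
other two are returned as they were given.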

The following subsections describe the right side, left
side, and conditional parts of rules and present some
examples.


3.1 RIGHT SIDE OF RULES

The right side of a rule describes phonetic sequences and is
therefore a pattern. This pattern consists of three parts:
a left context, a nucleus, and a right context. Any of the
three parts may be vacuous. However, at least one of the
three parts must not be vacuous. The nucleus part of the
right side matches the portion of the input sequence that is
affected by the transformation performed by this rule. The
left and right contexts specify the necessary environment in
which the nucleus is to occur. Normally, the nucleus is
delimited by a / pair. If the pair is not present, then the
whole right side is assumed to be the nucleus and the left
and right contexts are assumed to be vacuous. The following
paragraphs describe the constituents (<right-part>s) that
make up the nucleus and the left and right contexts.


3.1.1 Phone Name

The name of a phone may be used to specify the occurrence of
that phone in the input string. For example, M means the
occurrence of M in the input string, and IY means the
occurrence of IY in the input string (with any stress
level). If it is desired to restrict the stress level of a
vowel to a particular value, then follow the vowel name with
a colon and an explicit stress level. Thus, to specify the
occurrence of an IH with 2 stress, write IH:2.


3.1.2 Single Feature Specification

The occurrence of any phone with a specific feature may be
specified. For instance, BOUNDARY may be used to specify
the occurrence of either * or #. Similarly, CONST specifies
the occurrence of any consonant, and VOWEL specifies the
occurrence of any vowel. In a like manner, the values of
CLASS and PLACE (such as NASAL, which specifies the
occurrence of M, N, or NX, or VELAR, which specifies the
occurrence of either NX, K, or G) may be used. Also, VOICE
may be used to specify the occurrence of a voiced phone.
To specify the occurrence of any vowel with a specific
stress level, write VOWEL followed by a colon and an
explicit stress level. For example, to specify the
occurrence of a reduced vowel, write VOWEL:0. Any of the
described specifications may be optionally preceded by + or
-. Plus (+) merely emphasizes that the specification
following is necessary; the + is therefore just window
dressing. Minus (-), on the other hand, means that the
specification following must not occur. Thus, -VOICE
specifies the occurrence of any unvoiced phone, -VELAR
specifies the occurrence of any phone except NX, K, or G,
and -R specifies the occurrence of any phone except R.


3.1.3 Multiple Feature Specification

The occurrence of a phone that simultaneously possesses
several features may be specified by a <feature-bundle>. A
feature bundle is represented as a sequence of feature
specifications (including phone names, choices, and other
feature bundles) enclosed in parentheses. The included
specifications may be preceded by + or -. For example,

(FRIC LABIAL) and (FRIC+LABIAL)

both specify the occurrence of a labial fricative, i.e.,
either F or V. The feature bundles

(FRIC-LABIAL) and (+FRIC-LABIAL)

both specify the occurrence of a non-labial fricative, i.e.,
TH, S, SH, DH, Z, or ZH. An example of a nested feature
bundle is

(FRIC-(LABIAL+VOICE))

This specifies the occurrence of any fricative that is not
both voiced and labial. It could have been written more
simply as

(FRIC -V)
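
The semantics of +, -, and nested bundles can be checked with a
small Python sketch. The feature table below is a partial,
assumed one (the places of articulation are standard phonetics,
not copied from Appendix 2), and the tuple encoding of
specifications is invented for the illustration.

```python
# Partial, assumed feature table; Appendix 2 holds the real one.
FEATURES = {
    "F":  {"CONST", "FRIC", "LABIAL"},
    "V":  {"CONST", "FRIC", "LABIAL", "VOICE"},
    "TH": {"CONST", "FRIC", "DENTAL"},
    "DH": {"CONST", "FRIC", "DENTAL", "VOICE"},
    "S":  {"CONST", "FRIC", "ALVEOLAR"},
    "Z":  {"CONST", "FRIC", "ALVEOLAR", "VOICE"},
    "SH": {"CONST", "FRIC", "ALVPAL"},
    "ZH": {"CONST", "FRIC", "ALVPAL", "VOICE"},
    "P":  {"CONST", "PLOS", "LABIAL"},
    "B":  {"CONST", "PLOS", "LABIAL", "VOICE"},
}


def matches(phone, spec):
    """spec is a feature or phone name, ("-", spec), or ("bundle", [specs])."""
    if isinstance(spec, str):
        return spec == phone or spec in FEATURES[phone]
    op, arg = spec
    if op == "-":                       # the specification must not occur
        return not matches(phone, arg)
    if op == "bundle":                  # every member must hold at once
        return all(matches(phone, member) for member in arg)
    raise ValueError("unknown specification: %r" % (op,))


def select(spec):
    """Return the set of phones in the table that satisfy spec."""
    return {phone for phone in FEATURES if matches(phone, spec)}
```

With this table, (FRIC-(LABIAL+VOICE)) and (FRIC -V) select the
same set of phones, namely every fricative except V.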


3.1.4 Choice Specification

The occurrence of a phone in the input sequence may be
specified as a <choice> among several specifications. The
individual choice specifications may be phone names, a
feature, or a feature bundle. The choices are separated by
OR. For example, the choice

NASAL OR (PLOS+VOICE)

specifies the occurrence of a nasal or a voiced plosive.
Thus, it would match any of N, M, NX, B, G, or D. For
another example, the choice

GLIDE OR R OR L

specifies the occurrence of any of Y, W, R, or L. The
equivalent of an AND operator is provided through the
feature bundle mechanism.

3.1.5 Specification of Optional Occurrences

The optional occurrence of a phone in the input string can
be specified by an <optional> phrase. The word OPT is
followed by a phone name, feature, feature bundle, or
choice. For example, the specification sequence

VOWEL, OPT R, T

would match both of the following input strings:

IY R T and IY T

If a phone is detected in the input string that matches the
specification of an optional phrase, then it is passed over
before a matching of the rest of the input string to the
rest of the pattern is attempted. This means that there is
no automatic backup. To illustrate this, the pattern
sequence

VOWEL, OPT VOICE, R

would match the input string

OW AX R

but would not match the input string

OW R D

In fact, the above pattern sequence would match nothing that
did not also match the pattern sequence

VOWEL, VOICE, R


3.1.6 Specification of Repeated Occurrences

The repeated occurrence of phones that match a particular
set of criteria may be specified by a <repeat> phrase. The
phrase is introduced by the word REP followed by the minimum
acceptable number of occurrences (a non-negative integer)
and the match criterion. The match criterion may be a phone
name, feature, feature bundle, or choice. For example, a
pattern that matches any monosyllabic word is

#, REP 0 CONST, VOWEL, REP 0 CONST, #

In the above, the input sequence is specified to include a
beginning word boundary (#) followed by zero or more
consonants, a vowel (at any stress level), zero or more
trailing consonants, and an ending word boundary. In the
next example, the pattern sequence will match the beginning
consonant cluster and vowel in a syllable whose initial
cluster contains at least two phones.

BOUNDARY, REP 2 CONST, VOWEL

Repeat phrases, like optional phrases, do no backup. The
repeat matches as many phones in the input string as
possible. (If at least the specified minimum number of
occurrences are found, then matching of the rest of the
input string to the rest of the right side continues.) For
example, the pattern sequence

REP 0 CONST, R

would not match anything because R is a CONST, and as such
would be passed over by the repeat. This may be remedied by
rewriting the pattern sequence as

REP 0 (CONST-R), R

This second pattern sequence does the job because in English
two Rs cannot occur in the same consonant string unless
separated by a syllable or word boundary.
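
The no-backup behavior of OPT and REP can be made concrete with
a greedy matcher. The following Python sketch is not the
compiled matcher of the system; the part encoding ("one", "opt",
"rep"), the helper names, and the small phone sets are all
assumptions chosen to cover the examples above.

```python
# Small assumed phone sets, just large enough for the examples.
VOWELS = {"OW", "AX", "IY"}
CONSTS = {"R", "D", "T", "S", "P", "B", "M", "N", "L"}
VOICED = VOWELS | {"R", "D", "B", "M", "N", "L"}


def is_vowel(phone):
    return phone in VOWELS


def is_const(phone):
    return phone in CONSTS


def is_voiced(phone):
    return phone in VOICED


def name(wanted):
    return lambda phone: phone == wanted


def match(pattern, phones):
    """Match pattern against the start of phones without backup.
    Return the number of phones consumed, or None on failure."""
    pos = 0
    for part in pattern:
        kind, test = part[0], part[-1]
        if kind == "one":                 # exactly one phone is required
            if pos >= len(phones) or not test(phones[pos]):
                return None
            pos += 1
        elif kind == "opt":               # passed over whenever it matches
            if pos < len(phones) and test(phones[pos]):
                pos += 1
        elif kind == "rep":               # ("rep", minimum, test), greedy
            count = 0
            while pos < len(phones) and test(phones[pos]):
                pos += 1
                count += 1
            if count < part[1]:
                return None
    return pos
```

Because each part commits to what it consumed, the pattern
REP 0 CONST, R fails on S T R (the repeat swallows the R), while
REP 0 (CONST-R), R succeeds.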


3.1.7 Examples of Complete Right Side Patterns

The complete right side of a rule consists of a nucleus and
a left and a right context. Each of these constituents of
the right side comprises a sequence of phone names,
features, feature bundles, choices, optional phrases, and
repeat phrases. The nucleus is normally delimited by a /
pair. The nucleus is the portion that will be replaced if
the rule applies. The members of the sequence are separated
from each other by commas. If a / separates two
specifications, then a comma should not be used.

VOWEL/T/VOWEL

The left context is the one-element sequence, VOWEL. The
nucleus is the one-element sequence, T. The right context
is the one-element sequence, VOWEL. If a rule with this
right side matches an input sequence, then the nucleus (T)
would be replaced by the sequence generated by the rule's
left side.

/D OR T, BOUNDARY/Y

In this example, the left context is vacuous, and the right
context is the one-phone sequence, Y. The nucleus is the
two-element sequence D OR T, BOUNDARY. The nucleus matches
any one of these four input sequences:

D *, D #, T *, and T #

VOWEL, NASAL or /VOWEL, NASAL/

Both the left and right contexts are vacuous in these two
equivalent pattern sequences. The nucleus is the
two-element sequence VOWEL, NASAL. These two examples
illustrate the point that if the / pair is omitted from the
rule's right side, then the entire right side is the
nucleus.

NASAL//PLOS

In this example, the nucleus is vacuous. The / pair merely
marks a place at which the left side can insert a phone
string if the rule matches.


3.1.8 Indices of Right Side Parts

Components of a conditional phrase and a rule's left side
can reference features of the phones that were matched by
the rule's right side. The referenced phone is specified by
@ followed by a strictly positive integer. Each right part
in the right side is assigned an index number starting from
one. For example, in the pattern sequence

VOWEL,OPT BOUNDARY/IY OR IH,REP 1 LABIAL/(PLOS-VELAR),#

there are six right parts:

1 VOWEL,
2 OPT BOUNDARY,
3 IY OR IH,
4 REP 1 LABIAL,
5 (PLOS-VELAR), and
6 #

Though optional or repeat phrases are assigned index
numbers, the features of the phones they match may not be
referenced because it is indeterminate whether they matched
anything at all and, if so, how many phones were matched.
Example uses of indexed references and their meanings are:

NAME@1     name of phone matched by the first right part
KIND@2     kind of phone matched by the second right part
PLACE@3    place of phone matched by the third right part
CLASS@2    class of phone matched by the second right part
VOICE@1    voicing of phone matched by the first right part
-VOICE@3   inverse of the voicing of phone matched by the
           third right part
STRESS@1   stress level of phone matched by the first
           right part
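
The numbering of right parts, and the way the / pair marks off
the nucleus, can be sketched as a small Python splitter. It is
an illustration only: the function name split_right_side and the
returned (parts, nucleus span) shape are assumptions, and the
splitter does not validate the individual specifications.

```python
def split_right_side(text):
    """Split a right side into its parts, numbered from one.
    Return (parts, span) where span is the (first, last) pair of part
    indices forming the nucleus, or None when the nucleus is vacuous."""
    sections, current, piece, depth = [], [], "", 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if depth == 0 and ch in ",/":     # separators outside parentheses
            if piece.strip():
                current.append(piece.strip())
            piece = ""
            if ch == "/":                 # a / closes the current section
                sections.append(current)
                current = []
        else:
            piece += ch
    if piece.strip():
        current.append(piece.strip())
    sections.append(current)
    if len(sections) == 1:                # no / pair: all of it is nucleus
        sections = [[], sections[0], []]
    if len(sections) != 3:
        raise ValueError("a right side has either no / or exactly one pair")
    left, nucleus, right = sections
    parts = left + nucleus + right
    span = (len(left) + 1, len(left) + len(nucleus)) if nucleus else None
    return parts, span
```

Applied to the six-part example above, the sketch reports parts
3 and 4 as the nucleus.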


3.2 CONDITIONAL PART OF RULES

Use of a <conditional> with a rule is optional. If the
<conditional> is omitted, the only criterion for a rule's
matching an input string is that the right side of the rule
properly describe (match) the string. If a conditional
phrase is used, it presents additional criteria that must
also be satisfied for the rule to match the input string.

The form of a conditional is the word IF followed by the
body of the conditional. The body is a series of
relationships separated by the word AND or OR (inclusive).
AND binds tighter than OR. Thus, if r1, r2, and r3 are
relations, then the meaning of

r1 OR r2 AND r3 is r1 OR (r2 AND r3)

To overcome the normal binding scheme, parentheses may be
used to explicitly group the relations and operators. For
instance, to achieve the other interpretation of the above
example, write

(r1 OR r2) AND r3
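
The binding scheme can be demonstrated with a small recursive
descent evaluator in Python. It is a sketch under stated
assumptions: relations are reduced to names (r1, r2, r3) whose
truth values are supplied in a table, and the token list is
already split; it is not the system's conditional compiler.

```python
def evaluate(tokens, values):
    """Evaluate a conditional body given as a token list.
    values maps each relation name to True or False."""
    pos = 0

    def parse_or():                 # lowest precedence: r OR r OR ...
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            rhs = parse_and()
            result = result or rhs
        return result

    def parse_and():                # AND binds tighter than OR
        nonlocal pos
        result = parse_atom()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            rhs = parse_atom()
            result = result and rhs
        return result

    def parse_atom():               # a relation or a parenthesized group
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            result = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing right parenthesis")
            pos += 1
            return result
        return values[token]

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: %r" % (tokens[pos],))
    return result
```

With r1 true and r2 and r3 false, the ungrouped form is satisfied
and the parenthesized form is not, which shows that the two
interpretations really differ.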

Relations may either test a feature of a single phone or
compare the features of two phones. Tests are usually
indicated by using one of the operators EQ, NQ, GQ, LQ, GR,
and LS. An example of a relation is

PLACE@1 NQ PLACE@3

which is satisfied if the place of articulation of the phone
matching the first right part is not equal to the place of
articulation of the phone matching the third right part.
(See section 3.1.8 for an elaboration on the meaning of
indexed references to right parts such as @1 and @3 in this
example.) Another example of a relation is

CLASS@2 EQ FRIC

which is satisfied if the class of the phone that matched
the second right part is FRIC. At first, this may seem
unnecessary. Could not the second right part just have been
written FRIC and the relation not used? To answer this
question, consider this example right side and conditional:

CONST,PLOS OR FRIC IF VOICE@1 OR CLASS@2 EQ FRIC

Together, the right side and conditional match a two-phone
sequence if either the first phone is voiced and the second
phone is a plosive or fricative, or if the first phone is
any consonant and the second phone is a fricative. To write
such complex matching criteria as this, it is necessary to
have conditionals and to be able to write tests against
constants.

Relations that test a phone's kind, class, place, and name
may be written in one of two ways as demonstrated by the
above examples. In the first method, an indexed feature
category is compared by the operator EQ or NQ to a constant
value in that feature category. Examples are:

KIND@4 EQ VOWEL
CLASS@1 NQ PLOS
PLACE@2 EQ VELAR
NAME@3 NQ IY

The second method of comparison matches the feature values
of two different phones. Examples are:

KIND@3 NQ KIND@1
CLASS@1 EQ CLASS@2
PLACE@3 NQ PLACE@2
NAME@2 EQ NAME@1

Relations involving stress level may be made in a similar
manner. In addition, the operators GQ, LQ, GR, and LS may
be used. Some examples are:

STRESS@2 GR 0
STRESS@1 LS STRESS@3

The possible constant values of stress level are 0, 1, and
2.

Since there are no symbols for constant values of voicing,
the single-phone tests are written as in these examples:

VOICE@2
-VOICE@3

Comparisons between the voicing of two phones are written as
in these examples:

VOICE@2 EQ VOICE@3
VOICE@2 NQ VOICE@1

Only the operators EQ and NQ may be used in voicing
comparisons.


3.3  LEFT  SIDE  OF  RULES 

The <left-side> of a rule specifies the sequence of phones
that is to replace the sequence of phones that was matched
by the nucleus part of the rule's right side. If the
sequence to be substituted is vacuous, then the left side is
NIL. For example,

$ DEGEM NIL=/CONST/OPT BOUNDARY,CONST
IF NAME@1 EQ NAME@3;

This is the version of the standard degemination rule that
removes the first consonant of a doubled pair whether or not
they are separated by a word or syllable boundary. DEGEM
produces the following transformations:

T T    to    T
S*S    to    *S
M#M    to    #M

If the sequence to be substituted is not vacuous, then it is
represented by a sequence of <left-part>s separated by
commas. (In a prior version of the system, a rule could
have multiple sequences of left parts. See [2].) The
allowed kinds of left parts are consonant names, boundary
names, vowel specifiers, and constructed consonants. The
following paragraphs describe the different kinds of left
parts and present some examples of complete rules.


3.3.1 Consonant and Boundary Name Left Parts

A consonant or boundary name may be used as a left part.
For example, in the rule

$ FLAP DX=VOWEL/T OR D/VOWEL
IF STRESS@1 GR STRESS@3;

DX is a left part. It is substituted for an intervocalic T
or D whenever the stress level of the first vowel exceeds
that of the second vowel.

In addition, consonants and boundaries may be specified as
left parts by use of index numbers. For example,

$ POO 3,2=VOWEL/BOUNDARY,R OR L/VOWEL;

In this example, the two phones matching the nucleus are
transposed. The index 3 references the phone R or L, and
the index 2 references the phone * or # that matched
BOUNDARY. Two transformations that would be produced by
this rule are

AH*R IY to AH R*IY
AX#L UW to AX L#UW

Index references are restricted to phones that are matched
by the right side. Thus, it is illegal to reference, say,
the first phone following the string matched by the pattern.


3.3.2  Vowel  Specification  Left  Parts 

The specification of a vowel in the reconstruction sequence
may be accomplished in a variety of ways. The vowel name
and the stress level may be given explicitly, the name
and/or stress level may be borrowed using indices, or the
stress level may be borrowed implicitly. The various
techniques are demonstrated by the following examples.

$ R1 IH:1=/VOWEL/OPT BOUNDARY,N;

In this example, an IH with stress level one is substituted
for the vowel that matched the first right part.

$ R2 IH@1=/VOWEL/OPT BOUNDARY,N;

Like the above example, IH is substituted for the vowel that
matched the first right part. However, the stress level is
borrowed from the original vowel by @1. Thus, if the input
string were IY:0*N then rule R1 produces IH:1*N and rule R2
produces IH:0*N.

$ R3 1,R=/VOWEL,*,ER:0/;

As with consonant and boundary names, vowel names may be
referenced by an index. In this example, the left part 1 is
whatever vowel (and its stress level) that is matched by the
first right part. Thus, R3 would transform the input string
AH:2*ER:0 to AH:2 R.

$ R4 1@3,R=VOWEL,*,ER;

This example is like R3 except that only the vowel name is
borrowed from the phone matching the first right part. The
stress level is borrowed from the ER that matches the third
right part (by the @3). Thus, R4 would transform the input
string AH:2*ER:0 to AH:0 R.
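
The four ways of naming a vowel and its stress level in R1
through R4 can be summarized in a Python sketch. The encoding is
invented for the illustration: a source is either ("lit", value)
for an explicit name or stress level, or ("ref", k) for a value
borrowed from the phone matched by right part k.

```python
def build_vowel(name_source, stress_source, matched):
    """Construct the (name, stress) pair for a vowel left part.
    matched maps a right-part index to the (name, stress) it matched."""
    def resolve(source, field):
        kind, value = source
        if kind == "lit":               # given explicitly in the rule
            return value
        return matched[value][field]    # borrowed from right part `value`

    return (resolve(name_source, 0), resolve(stress_source, 1))


# Phones matched by the right sides for the two example inputs.
matched_r1_r2 = {1: ("IY", 0), 2: ("*", None), 3: ("N", None)}   # IY:0*N
matched_r3_r4 = {1: ("AH", 2), 2: ("*", None), 3: ("ER", 0)}     # AH:2*ER:0

r1 = build_vowel(("lit", "IH"), ("lit", 1), matched_r1_r2)   # IH:1
r2 = build_vowel(("lit", "IH"), ("ref", 1), matched_r1_r2)   # IH@1
r3 = build_vowel(("ref", 1), ("ref", 1), matched_r3_r4)      # 1
r4 = build_vowel(("ref", 1), ("ref", 3), matched_r3_r4)      # 1@3
```

The four results are IH:1, IH:0, AH:2, and AH:0, in agreement
with the transformations described for R1 through R4.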
$ R5 IH=/IY OR EH/N;

In this example, only a vowel name (IH) is given as a left
part. When this form of left part is used, the stress level
is borrowed from the phone that matches the first vowel
specification in the nucleus. These are two transformations
that result from the application of rule R5:

IY:1 N to IH:1 N
EH:0 N to IH:0 N

Another example of implicit stress borrowing is rule R6:

$ R6 ER=VOWEL,R,*,VOWEL;

R6 transforms the input string EH:2 R*AX to ER:2 because the
stress level is borrowed from the first vowel.

Two things should be noted in using vowel-specifying left
parts: (1) it is illegal to write such things as AX@3 or
AX:0 because AX is automatically given a stress of zero, and
(2) when an implicit stress level is borrowed, it is never
taken from a vowel that was matched by a repeat or optional
phrase; it is borrowed from the first other right part in
the nucleus that specifies a vowel (if there is a choice,
then each choice must specify a vowel).


3.3.3 Constructed Consonants

Some consonant phones may be constructed by specifying their
features. Specifically, the class, place of articulation,
and voicing must be specified, in that order, and enclosed
in parentheses. Some examples of constructed consonants
are:

(NASAL ALVEOLAR VOICE)
(PLOS PLACE@2 -VOICE)
(CLASS@3 LABIAL VOICE@1)

The first example is equivalent to having written N. The
class specification may be either a class name or the word
CLASS followed by @ and an index. Examples from the above
are NASAL, PLOS, and CLASS@3. In the last case, the class
of the constructed phone is made the same as the class of
the phone that matched the third right part. In a similar
manner, the place of articulation may be either a place name
or the word PLACE followed by @ and an index. Examples from
the above are ALVEOLAR, PLACE@2, and LABIAL. Voicing of a
constructed consonant is specified by either VOICE or -VOICE
with the obvious meanings, or by use of a borrowed voicing,
e.g., VOICE@3 or -VOICE@2. An example of a rule that uses a
constructed consonant is:

$ JHA 2,(AFRIC ALVPAL VOICE@1)=T OR D,BOUNDARY,Y;

This rule would make transformations such as

D*Y to *JH
T#Y to #CH

A caution should be observed when using constructed
consonant forms, namely, that there must exist a phone with
the specified class, place of articulation, and voicing.
Because of this, it is illegal to construct a consonant in
the class MISC. It is the user's responsibility to guard
against the generation of illegal phones. The system does
little run-time checking.
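
A constructed consonant amounts to a lookup from (class, place,
voicing) to a phone. The Python sketch below illustrates this
for rule JHA. The table is a small assumed excerpt, not
Appendix 2, and the function names are hypothetical. Unlike the
system as described, the sketch raises an error when no phone
has the requested features.

```python
# Assumed excerpt of the consonant inventory, keyed by
# (class, place of articulation, voiced?).
CONSONANTS = {
    ("NASAL", "ALVEOLAR", True): "N",
    ("AFRIC", "ALVPAL", True): "JH",
    ("AFRIC", "ALVPAL", False): "CH",
    ("PLOS", "LABIAL", True): "B",
    ("PLOS", "LABIAL", False): "P",
}

# Voicing of the two phones that JHA can match as its first right part.
VOICED = {"D": True, "T": False}


def construct(phone_class, place, voiced):
    """Return the phone with the given class, place, and voicing."""
    try:
        return CONSONANTS[(phone_class, place, voiced)]
    except KeyError:
        raise ValueError("no phone has these features") from None


def apply_jha(phones):
    """Apply 2,(AFRIC ALVPAL VOICE@1)=T OR D,BOUNDARY,Y to a name list."""
    out, i = [], 0
    while i < len(phones):
        window = phones[i:i + 3]
        if (len(window) == 3 and window[0] in VOICED
                and window[1] in ("*", "#") and window[2] == "Y"):
            # Left side: the boundary (index 2), then an affricate that
            # borrows its voicing from the first right part.
            out.append(window[1])
            out.append(construct("AFRIC", "ALVPAL", VOICED[window[0]]))
            i += 3
        else:
            out.append(phones[i])
            i += 1
    return out
```

The voiced D yields JH and the unvoiced T yields CH, as in the
two transformations listed above.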


4. LEXICON AND SUB-LEXICON DEFINITION

This section describes the commands that are used to define
lexicon entries and form sub-lexicons. Appendix 1 gives a
formal syntax description of these forms (<lexicon> and
<sub-lexicon>).


4.1 DEFINITION OF LEXICON ENTRIES

There are three basic forms of the lexicon command: (1)
add, replace, or modify a lexicon entry; (2) augment a
lexicon entry; and (3) print out a lexicon entry. All
lexicon commands begin with the word LEX and end with a
semicolon. A lexicon entry is identified by a word, e.g., an
identifier such as HELLO or ONE.TWO.THREE. Associated
with each word in the lexicon are one or more ARPAbet
spellings. Each of these spellings is called a lexical base
form or, more simply, just a form.


The basic command that adds a new word and its forms to the
lexicon is the word LEX followed by the word and the forms
spelled in ARPAbet (as described in Section 2.4). To enter
a phonological spelling of the word TOTAL, input this
command:

LEX TOTAL (T OW:2 T*AX L);

Recall that the exterior word boundaries are automatically
added so that the actual spelling is

#T OW:2 T*AX L#

If it is desired to enter the word TOTAL with two forms,
then the forms are separated by commas; for example:

LEX TOTAL (T OW:2 T*AX L), (T OW:2 T*UH:0 L);

With either of the above examples, if TOTAL was already in
the lexicon, all existing forms would be deleted, and the
new definition would comprise the entire set of forms for
this word.

Various commands allow the forms to be referenced
individually. The language mechanism is the word followed
by a colon and an index number. Thus, given the second
definition above of TOTAL,

TOTAL:1 is #T OW:2 T*AX L#

and

TOTAL:2 is #T OW:2 T*UH:0 L#

It is also possible to selectively alter the definition of a
particular form as opposed to redefining the whole entry.
For example, after

LEX TOTAL:2 (T OW:2 T*AH:0 L);

TOTAL:1 is unaffected but

TOTAL:2 is now #T OW:2 T*AH:0 L#

When using this form of the lexicon command, the index must
reference an existing form or be one greater than the number
of forms currently in existence. In the latter case, the
new form is added to the lexicon entry.

New forms may easily be added to the lexicon entry. Assume
that the command

LEX CUP (K AH P);

has been executed, and then the command

LEX CUP + (K AH B), (K UH P);

is entered. There are now three forms of the word CUP:

CUP:1 is #K AH:1 P#
CUP:2 is #K AH:1 B#
CUP:3 is #K UH:1 P#

Thus, new forms are added by using + followed by one or more
ARPAbet spellings.

The lexicon command is also used to output forms to the
user's terminal. Given the above definition of CUP, the
command

LEX CUP;

would output all three spellings. The command

LEX CUP:2;

would only output the spelling of CUP:2.
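
The behavior of the lexicon command can be modeled in a few
lines. The following Python sketch is illustrative only: the
class Lexicon and its method names are hypothetical and are
not part of the rule system, and stress levels are written
out explicitly rather than defaulted.

```python
class Lexicon:
    """Toy model of the LEX command: each word maps to an ordered list of forms."""

    def __init__(self):
        self.entries = {}

    @staticmethod
    def _bound(form):
        # Exterior word boundaries are added automatically.
        return "#" + form + "#"

    def define(self, word, *forms):
        # LEX word (form), (form);  replaces every existing form of the word.
        self.entries[word] = [self._bound(f) for f in forms]

    def set_form(self, word, index, form):
        # LEX word:index (form);  alters one form, or appends a new one when
        # index is one greater than the current number of forms.
        forms = self.entries.get(word, [])
        if 1 <= index <= len(forms):
            forms[index - 1] = self._bound(form)
        elif index == len(forms) + 1:
            self.entries.setdefault(word, []).append(self._bound(form))
        else:
            raise IndexError(f"{word}:{index} is out of range")

    def add_forms(self, word, *forms):
        # LEX word + (form), (form);  appends new forms to the entry.
        self.entries.setdefault(word, []).extend(self._bound(f) for f in forms)

    def forms(self, word, index=None):
        # LEX word;  or  LEX word:index;  returns all forms or a single form.
        all_forms = self.entries[word]
        return list(all_forms) if index is None else [all_forms[index - 1]]


lex = Lexicon()
lex.define("CUP", "K AH:1 P")
lex.add_forms("CUP", "K AH:1 B", "K UH:1 P")
print(lex.forms("CUP"))      # all three forms
print(lex.forms("CUP", 2))   # only CUP:2
```

Indices are 1-based, as in the command language; an index more
than one past the end is rejected rather than padded.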


4.2  DEFINITION  OF  SUB-LEXICONS 

A sub-lexicon is defined by a <sub-lexicon> command. The
format of the command is the word SLEX followed by the
sub-lexicon name (an identifier), and the constituents
separated by commas. For example,

SLEX EXAMPLE TOTAL, CUP:2;

The sub-lexicon EXAMPLE is defined to contain all the forms
of the word TOTAL but only the second form of the word CUP.
A word or its forms may appear in any number of sub-lexicons.

A sub-lexicon definition is maintained in symbolic form.
Therefore, the actual forms that constitute the sub-lexicon
are those in existence when the sub-lexicon is referenced,
not necessarily the same as those in existence when the
sub-lexicon was defined.
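
The consequence of keeping sub-lexicons in symbolic form can
be illustrated with a short Python sketch. The dictionaries
and the function resolve are hypothetical names chosen for
this illustration; the point is only that constituents are
looked up at reference time, not at definition time.

```python
# A tiny lexicon: word -> list of forms (index 1 is the first form).
lexicon = {
    "TOTAL": ["#T OW:2 T*AX L#", "#T OW:2 T*UH:0 L#"],
    "CUP": ["#K AH:1 P#", "#K AH:1 B#"],
}

# SLEX EXAMPLE TOTAL, CUP:2;  stored symbolically as (word, index-or-None).
sublexicons = {"EXAMPLE": [("TOTAL", None), ("CUP", 2)]}


def resolve(name):
    """Expand a sub-lexicon into the forms that exist at the time of the call."""
    result = []
    for word, index in sublexicons[name]:
        forms = lexicon[word]
        result.extend(forms if index is None else [forms[index - 1]])
    return result


before = resolve("EXAMPLE")
# TOTAL is redefined after the SLEX command was given.
lexicon["TOTAL"] = ["#T OW:2 T*AH:0 L#"]
after = resolve("EXAMPLE")
print(before)
print(after)
```

The second call reflects the redefinition of TOTAL even though
the sub-lexicon itself was never touched.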


5.  RULE-APPLYING SUBRS

When a rule is operated on a phonetic input string, it is
usually desired to try it on all substrings, not just the
input as a whole. Therefore, given the rule,

$ FLAP DX = VOWEL,OPT BOUNDARY / T OR D / OPT BOUNDARY,VOWEL;

the desired transformation of

#WH AH:1 T#AX#D EY:2# is #WH AH:1 DX#AX#DX EY:2#

Thus, it is necessary to define the algorithm by which a
rule is tried on substrings of the input.

Also, it is usual to operate rules in groups. Such a group
is called a rule set. At issue is the method of defining
the constituents of a rule set and the order dependencies of
the set members. The system provides three methods of
specification: (1) ordered rule sets -- normally used with
"obligatory" rules, (2) unordered rule sets -- normally used
with "optional" rules, and (3) nondeterministic rule sets --
normally used for fun. All three types are defined by
<subr> commands.


The following subsections describe substring selection
algorithms and the methods of defining rule-applying subrs.


5.1  SUBSTRING SELECTION


An input string is a phonological spelling of a word or a
sequence of words. It is unusual that a rule will match (or
was intended to match) the entire input string. Therefore,
the rule set appliers must select substrings as possible
candidates on which to try the rules. By example, the
possible substrings of

#K AH:2 N#D UW:1#

are:

#K AH:2 N#D UW:1#
K AH:2 N#D UW:1#
AH:2 N#D UW:1#
N#D UW:1#
#D UW:1#
D UW:1#
UW:1#
#


Because neither optional nor repeat phrases (in the right
side of rules) perform backtracking and because rules permit
arbitrary parts of the input string to exist to the right of
the substring matched by the right part, the above set of
substrings is sufficient. With the different types of
rule-applying subrs, the interaction (ordering) of members
of the rule set with substrings of the input and substrings
of the derived strings may differ as described below.
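
The candidate substrings are simply the suffixes of the
input, one per phone position. A minimal Python sketch
follows; the helper names phones and candidate_substrings
are hypothetical, and the tokenizer assumes only that "#"
marks a boundary and that blanks separate the other phones.

```python
def phones(spelling):
    """Split a spelling such as '#K AH:2 N#D UW:1#' into phones and boundaries."""
    tokens = []
    for chunk in spelling.split():
        current = ""
        for ch in chunk:
            if ch == "#":
                # A boundary ends the phone collected so far, if any.
                if current:
                    tokens.append(current)
                    current = ""
                tokens.append("#")
            else:
                current += ch
        if current:
            tokens.append(current)
    return tokens


def candidate_substrings(spelling):
    """Return one candidate per phone position: the suffix starting there."""
    toks = phones(spelling)
    return [toks[i:] for i in range(len(toks))]


for candidate in candidate_substrings("#K AH:2 N#D UW:1#"):
    print(" ".join(candidate))
```

Only suffixes are needed because a rule may leave any amount
of input unmatched to the right of the part it matches.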


5.2  ORDERED RULE SUBRS


An ordered rule-applying subr is defined by the word SUBR
followed by its name (an identifier) and a sequence of
<subr-part>s separated by commas and terminated with a
semicolon. The word ORDERED may follow the word SUBR but is
not necessary -- ordered subrs are the default. In
operation, the first subr part is applied to each substring
of the input in turn, left to right. Then the second subr
part is applied to each substring, etc. If a rule in a subr
part matches the input string, it is immediately transformed
by the rule, and the rest of the processing continues at the
same phone position in the derived string. Thus, the subr
parts and the rules they comprise are treated as obligatory
transformations. The algorithm is presented symbolically in
Figure 1. Application of a subr part, SUBRPART(SPELL, I),
operates on the substring starting at the Ith phone
position. As a side effect, the values of the variables
SPELL and LSPELL (spelling length) may be altered.

    DEFINE RUNRULE(SUBRPARTLIST, ARPASPELL)
      DO SPELL := ARPASPELL;
         LSPELL := LENGTH(SPELL);
         FOR SUBRPART IN SUBRPARTLIST
           DO FOR I := 1 STEP 1 UNTIL I > LSPELL
                DO SUBRPART(SPELL, I);
              END FOR;
         END FOR;
    END RUNRULE;

                      Figure 1
         ORDERED RULE APPLICATION ALGORITHM
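
The ordered algorithm of Figure 1 can be rendered in Python
roughly as follows. This is a sketch under stated
assumptions, not the system's implementation: a subr part
is modeled as a function that takes the spelling and a
0-based phone position and returns the (possibly
transformed) spelling, and flap is a hand-written,
simplified stand-in for the compiled FLAP rule. The vowel
set is likewise an assumption made for the example.

```python
VOWELS = {"AH", "AX", "EY", "OW", "UW", "UH"}  # assumed subset, for illustration


def is_vowel(phone):
    return phone.split(":")[0] in VOWELS


def flap(spell, i):
    """Simplified FLAP: VOWEL, optional '#', T or D, optional '#', VOWEL."""
    j = i
    if j >= len(spell) or not is_vowel(spell[j]):
        return spell
    j += 1
    if j < len(spell) and spell[j] == "#":
        j += 1
    if j >= len(spell) or spell[j] not in ("T", "D"):
        return spell
    target = j
    j += 1
    if j < len(spell) and spell[j] == "#":
        j += 1
    if j < len(spell) and is_vowel(spell[j]):
        # Obligatory transformation: rewrite the T or D as DX.
        return spell[:target] + ["DX"] + spell[target + 1:]
    return spell


def run_ordered(subr_parts, arpaspell):
    spell = list(arpaspell)
    for part in subr_parts:
        i = 0
        # The length is re-read each time because a part may change it.
        while i < len(spell):
            spell = part(spell, i)
            i += 1
    return spell


result = run_ordered([flap], ["#", "WH", "AH:1", "T", "#", "AX", "#", "D", "EY:2", "#"])
print(" ".join(result))
```

Both the T and the D are rewritten because the rule is tried
at every phone position, and the second attempt sees the
string already transformed by the first.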

A subr part may be a rule name or a oneof, allof, if, or
unless phrase. A rule name used as a subr part means simply
operate the rule at the proper times. The following
paragraphs describe the other kinds of forms that may be
used as subr parts.


5.2.1  Oneof  Subr  Parts 

A oneof subr part is introduced by the word ONEOF followed
by a parenthesized list of one or more subr parts separated
by commas. For example,

ONEOF (R1,R2)

The rule names R1 and R2 are the embedded subr parts. A
oneof phrase runs its embedded subr parts (in the left to
right order of their appearance) on the currently visible
substring. If any rule in a subr part matches the input
substring, then after the completion of the operation of
that subr part, the rest of the oneof phrase in which it is
embedded is skipped. Therefore, in the above example, if R1
matches the input substring, then R2 is not operated.


5.2.2  Allof Subr Parts

An allof subr part is introduced by the word ALLOF followed
by a parenthesized list of one or more subr parts separated
by commas. For example,

ALLOF (ONEOF(R1,R2), R3, R4)

The embedded subr parts are the oneof phrase ONEOF(R1,R2)
and the rule names R3 and R4. An allof phrase operates the
embedded subr parts in the specified order, left to right.
In the above example, R1 is operated; if it matches the
input substring, then R3 and R4 are operated on the
transformed string. Otherwise, rules R2, R3, and R4 are
operated in that order. Recall that all embedded subr parts
are run on the substring starting at the same phone
position. Therefore, caution should be exercised to
ensure compatibility of operation with your original
intentions.


5.2.3  If and Unless Phrases

If and unless subr parts provide for standard if-then and
if-then-else control logic. An if phrase is introduced by
the word IF followed by a subr part, the word THEN, another
subr part, and optionally the word ELSE followed by yet
another subr part. Unless phrases are identical in format
to if phrases except that they are introduced by the word
UNLESS instead of the word IF. If s1, s2, and s3 are subr
parts, then the possible formats are:

IF s1 THEN s2

UNLESS s1 THEN s2

IF s1 THEN s2 ELSE s3

UNLESS s1 THEN s2 ELSE s3

For the first format, s1 is run. If any rule run in s1
matches the input, then s2 is run. Otherwise, s2 is
skipped. For the second format, s1 is run. If no rule run
in s1 matches the input, then s2 is run. Otherwise, s2 is
skipped. For the third format, s1 is run. If any rule run
in s1 matches the input, then s2 is run. Otherwise, s3 is
run. For the fourth format, s1 is run. If any rule run in
s1 matches the input, then s3 is run. Otherwise, s2 is run.
All applicable embedded subr parts (s1 and s2 or s3) are run
on the substrings at the same phone position. (If s1
matches the input, the input is transformed before the
operation of s2 or s3.)
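
The four kinds of phrase compose naturally as higher-order
functions. The Python sketch below is one possible model,
not the compiled form used by the system: each subr part is
assumed to be a function from (spelling, position) to a pair
(spelling, matched), and the names rule_part, oneof, allof,
if_then, unless, and replace_at are all hypothetical. How a
composite phrase reports "matched" to an enclosing phrase is
an assumption (any embedded rule matched).

```python
def rule_part(rule):
    """Wrap a rule (spell, i) -> new spelling or None as a subr part."""
    def part(spell, i):
        new = rule(spell, i)
        return (spell, False) if new is None else (new, True)
    return part


def oneof(*parts):
    def part(spell, i):
        for p in parts:
            spell, matched = p(spell, i)
            if matched:
                return spell, True      # the rest of the phrase is skipped
        return spell, False
    return part


def allof(*parts):
    def part(spell, i):
        any_matched = False
        for p in parts:                 # every part runs, left to right
            spell, matched = p(spell, i)
            any_matched = any_matched or matched
        return spell, any_matched
    return part


def if_then(cond, then, otherwise=None, negate=False):
    def part(spell, i):
        spell, matched = cond(spell, i)  # cond may transform the input first
        branch = then if matched != negate else otherwise
        if branch is None:
            return spell, matched
        spell, branch_matched = branch(spell, i)
        return spell, matched or branch_matched
    return part


def unless(cond, then, otherwise=None):
    return if_then(cond, then, otherwise, negate=True)


def replace_at(old, new):
    """Toy rule: rewrite the phone at position i when it equals old."""
    def rule(spell, i):
        return spell[:i] + [new] + spell[i + 1:] if spell[i] == old else None
    return rule


R1 = rule_part(replace_at("T", "DX"))
R2 = rule_part(replace_at("T", "Q"))
R3 = rule_part(replace_at("DX", "D"))

spell = ["#", "T", "#"]
print(oneof(R1, R2)(spell, 1))
print(allof(oneof(R1, R2), R3)(spell, 1))
print(unless(R3, R2)(spell, 1))
```

All embedded parts receive the same position i, which mirrors
the caution given above for allof phrases.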


5.3  UNORDERED RULE SUBRS

An unordered rule-applying subr is defined by the word SUBR
followed by the word UNORDERED, its name (an identifier),
and the names of the rules in the rule set separated by
commas. For example:

SUBR UNORDERED XYZ R1,R2,R3;

XYZ is defined as an unordered rule-applying subr that
operates the rule set that consists of the rules R1, R2, and
R3. Figure 2 shows the rule-application algorithm. When a
rule is run on an input string (CHANGES := RULE(ARPASPELL, I)),
it is passed two arguments: (1) the total input string and
(2) the phone position at which the current substring
starts. The value of the rule is false if the rule does not
match the current input string. If the rule does match,
then the value is the set of changes that should be produced
by the rule's left side. The function MAKECHANGE takes
two arguments: (1) the original spelling, and (2) the set of
change instructions. The value is a new spelling with the
changes made. The original spelling (the value of
ARPASPELL) is not altered. As can readily be seen by
tracing through LOOPER, each rule in RULESET is run at each
position of the input string, with and without other
applications of rules in the set.

    DEFINE RUNRULE(RULESET, ARPASPELL)
      DO LOOPER(RULESET, ARPASPELL, 1);
    END RUNRULE;

    DEFINE LOOPER(RULESET, ARPASPELL, INDEX)
      DO PRINT(ARPASPELL);
         FOR I := INDEX STEP 1 UNTIL I > LENGTH(ARPASPELL)
           DO FOR RULE IN RULESET
                DO CHANGES := RULE(ARPASPELL, I);
                   IF CHANGES
                     THEN DO NEWSPELL := MAKECHANGE(ARPASPELL,
                                                    CHANGES);
                             LOOPER(RULESET, NEWSPELL, I);
                          END;
                END FOR;
         END FOR;
    END LOOPER;

                      Figure 2
        UNORDERED RULE APPLICATION ALGORITHM
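
A Python rendering of LOOPER may make the recursion easier
to trace. It is a sketch only: rules are modeled as
functions returning either None or a change triple
(start, end, replacement), which is an assumed encoding of
the "set of changes"; make_change, r1, and r2 are
hypothetical; positions are 0-based; and spellings are
collected in a list instead of being printed.

```python
def make_change(spell, change):
    """Return a new spelling; the original list is not altered."""
    start, end, replacement = change
    return spell[:start] + replacement + spell[end:]


def run_unordered(ruleset, arpaspell):
    outputs = []

    def looper(spell, index):
        outputs.append(spell)                 # corresponds to PRINT(ARPASPELL)
        for i in range(index, len(spell)):
            for rule in ruleset:
                change = rule(spell, i)
                if change:
                    # Recurse on the derived spelling from the same position;
                    # the loop then continues on the untransformed spelling.
                    looper(make_change(spell, change), i)

    looper(list(arpaspell), 0)
    return outputs


def r1(spell, i):
    # Toy rule: A becomes B.
    return (i, i + 1, ["B"]) if spell[i] == "A" else None


def r2(spell, i):
    # Toy rule: X becomes Y when followed by B.
    if spell[i] == "X" and i + 1 < len(spell) and spell[i + 1] == "B":
        return (i, i + 1, ["Y"])
    return None


for spelling in run_unordered([r1, r2], ["X", "A"]):
    print(" ".join(spelling))
```

In this run r2 never fires: it would have to match at a
position to the left of the one where r1 applied, and LOOPER
does not revisit earlier positions in a derived spelling.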

The critical difference between an unordered and a
nondeterministic rule subr is the following case. Assume
that rule R1 applies to the input substring starting at
phone position i1, and rule R2 would apply to the input
substring as produced by the transformation done by the left
side of R1, but starting at phone position i2 where i2<i1.
Then an unordered subr will not operate R2 after R1, while a
nondeterministic subr will. The advantage of the unordered
subr is that it is much faster. In almost no case will the
difference in output be noticeable.


5.4  NONDETERMINISTIC RULE SUBRS

The format of a nondeterministic rule subr is identical to
that of an unordered rule subr except that the keyword
NODETERM is used rather than UNORDERED. For example:

SUBR NODETERM XYZ R1,R2,R3;

defines the rule-applying subr XYZ with rule set R1, R2, and
R3. Figure 3 shows the algorithm used for nondeterministic
rule application. In operation, a nondeterministic subr
attempts to apply every member of the rule set against every
possible substring of the input and the derived strings.
This process continues until no new spellings can be
generated and then terminates.
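
The process just described is a closure computation over
spellings, which can be sketched in Python with a worklist.
As before, this is an illustration under assumptions: the
change-triple encoding, make_change, r1, and r2 are
hypothetical, positions are 0-based, and a first-in
first-out choice is used where the algorithm permits any
choice.

```python
def make_change(spell, change):
    start, end, replacement = change
    return spell[:start] + replacement + spell[end:]


def run_nondeterministic(ruleset, arpaspell):
    done = []
    to_try = [tuple(arpaspell)]
    while to_try:
        spell = to_try.pop(0)          # any choice would do; FIFO is used here
        done.append(spell)
        for i in range(len(spell)):    # every position of every derived spelling
            for rule in ruleset:
                change = rule(list(spell), i)
                if change:
                    new = tuple(make_change(list(spell), change))
                    if new not in to_try and new not in done:
                        to_try.append(new)
    return [list(s) for s in done]


def r1(spell, i):
    # Toy rule: A becomes B.
    return (i, i + 1, ["B"]) if spell[i] == "A" else None


def r2(spell, i):
    # Toy rule: X becomes Y when followed by B.
    if spell[i] == "X" and i + 1 < len(spell) and spell[i + 1] == "B":
        return (i, i + 1, ["Y"])
    return None


for spelling in run_nondeterministic([r1, r2], ["X", "A"]):
    print(" ".join(spelling))
```

With these two toy rules the spelling Y B is derived, because
every derived spelling is rescanned from its first phone
position. The check against both sets is what guarantees
termination when only finitely many spellings are derivable.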


6.  THE  QUERY  COMMAND 

A ? may  be  used  to  query  the  system  for  the  names  of  the 
defined  entries.  The  four  forms  of  the  <query>  command  are: 

? RULES; 

? SUBRS; 

? SLEXS; 

? LEX; 

The respective meanings are: (1) output the names of all
defined rules, (2) output the names of all defined
rule-applying subrs, (3) output the names of all defined
sub-lexicons, and (4) output the names (not the spellings)
of all defined lexicon entries. All output is to the user's
terminal. For more detailed output, see Section 4 on the
lexicon command and Section 8 on the output commands.


7.  RUN COMMANDS

There are three commands that may be used to run a rule subr
against a form, a word in the lexicon, a sub-lexicon, or the
whole lexicon. The commands are <run>, <joe>, and <mary>
(see footnote 2). The run command begins with the word RUN
followed by the name of a rule-applying subr and the
specification of the input string(s). The joe command is
the word JOE followed by the specification of the input
string(s). JOE is an unordered rule-applying subr that uses
all rules that are currently defined. It is automatically
recompiled by an implicit subr command whenever it is
necessary. The mary command is the word MARY followed by
the specification of the input string(s). MARY is a
nondeterministic rule-applying subr that uses all the rules
that are currently defined. It is automatically recompiled
by an implicit subr command whenever it is necessary. All
forms of the run commands are terminated by semicolons.
When a subr is operated, all derived spellings are output
along with the names of the rules that have transformed the
string.


2 The names were chosen purely arbitrarily.


    DEFINE RUNRULE(RULESET, ARPASPELL)
      DO DONESET := EMPTY;
         TRYSET := SET OF (ARPASPELL);
      X: IF EMPTY(TRYSET) THEN GO D;
         SPELL := CHOICE OF (TRYSET);
         ADD SPELL TO DONESET;
         REMOVE SPELL FROM TRYSET;
         FOR I := 1 STEP 1 UNTIL I > LENGTH(SPELL)
           DO FOR RULE IN RULESET
                DO CHANGES := RULE(SPELL, I);
                   IF CHANGES
                     THEN DO NEWSPELL := MAKECHANGE(SPELL,
                                                    CHANGES);
                             IF NEWSPELL NOT IN TRYSET AND
                                NEWSPELL NOT IN DONESET
                               THEN ADD NEWSPELL TO TRYSET;
                          END;
                END FOR;
         END FOR;
         GO X;
      D: FOR SPELL IN DONESET DO PRINT(SPELL);
    END RUNRULE;

                      Figure 3
     NONDETERMINISTIC RULE APPLICATION ALGORITHM


The forms of the input string specification are described by
example.

RUN XYZ (T EH:2 S T);

The rule-applying subr XYZ is operated against the given
ARPAbet spelling with exterior word boundaries appended
automatically.


JOE TEST;

The rule-applying subr JOE is operated against each form of
the word TEST that is in the lexicon. If necessary, JOE is
recompiled.


MARY TEST:2;

The rule-applying subr MARY is operated against the second
form of the word TEST from the lexicon. If necessary, MARY
is automatically recompiled.


RUN ABC LEX;

The rule-applying subr ABC is operated against each form of
each word in the entire lexicon.


RUN JOE LEX FOO;

The rule-applying subr JOE is run against all forms in the
sub-lexicon FOO. This command is exactly equivalent to


JOE LEX FOO;


8.  OUTPUT COMMANDS


Three commands are available for the output of defined
objects and the output of the results of some run commands.
Output may be to the user's terminal, printer, or disk. The
<output> commands begin with the name of the device:
TERMINAL, PRINTER, or DISK. If DISK is used, then a file
name is also given (see footnote 3). The output options,
separated by commas, follow the device (and file)
specification. The possible output options and their
meanings are:


SUBRS - output the current definition of all rule-applying
subrs as subr commands.

RULES - output the current definition of all rules as $
commands.

SLEXS - output the current definition of all
sub-lexicons as slex commands.

LEX - output the current definition of each lexicon
entry (all forms) as lex commands.

ALL - equivalent to the output option sequence LEX,
SLEXS, RULES, SUBRS.

SUBR s - output the current definition of the
rule-applying subr s as a subr command.

RULE r - output the current definition of rule r as a $
command.

SLEX l - output the current definition of sub-lexicon l
as an slex command.

RUN s l - (where s is a rule-applying subr name,
including JOE and MARY, and l is a sub-lexicon name or
the word LEX) applies s to all forms specified by l;
each original and all derived forms are output.


3 The knowledgeable LISP user may instead specify a file
descriptor list. A file name alone, say FN, is equivalent
to the file descriptor list (FN INFIX AW). In any event,
if the selected disk output file already exists, it is
erased before the command is executed.

All output options except run print in a format that is
proper for recompilation. Thus, the command,

DISK XYZ ALL;

would output all current definitions to the file XYZ in a
recompilable format. To compile the contents of a disk
file, use a dcomp command such as

DCOMP XYZ;

The combination of a disk and dcomp command may be used to
obviate the necessity of saving the entire system module to
preserve work in progress.

Other examples of output commands are

TERMINAL SUBR FOO, RULE FLAP;

PRINTER RUN FOO SL1;

The first command outputs the definition of the
rule-applying subr FOO and the rule FLAP to the user's
terminal. The second command outputs the results of
operating the rule-applying subr FOO against each form in
the sub-lexicon SL1 to the high speed printer.

For some uses, the query command is more economical -- see
Section 6.


9.  DELETE COMMAND

A delete command may be used to remove from the system a
rule-applying subr, a rule, a sub-lexicon, or a lexicon
entry. The forms of the command are described by examples.

DEL SUBR XYZ;

The rule-applying subr XYZ is removed from the system. All
associated symbolic data and code are erased.

DEL RULE ABC;

The rule ABC is removed from the system. All associated
symbolic data and code are erased. Rule-applying subrs that
reference this rule should be edited and recompiled.

DEL SLEX QRS;

The definition of the sub-lexicon QRS is removed from the
system.

DEL TOTAL;

The lexicon entry TOTAL and all of its forms are deleted
from the lexicon. Sub-lexicons that reference this word or
any of its forms should be edited and recompiled.


10.  THE EDITOR AND THE RECOMPILE COMMAND


This section describes a mini-editor that may be used for
correcting input and a command for entering edit mode with
current definitions.


10.1  THE EDITOR

The editor is automatically entered when a syntax error is
detected or when certain other error conditions occur. The
input to the LISP INFIX compiler (and hence commands to the
rule system) are viewed as token strings. Examples of
tokens are identifiers, unsigned numbers, and delimiter
characters such as colon, plus (+), and equal (=). During
compilation, tokens are input and added to a "last tokens
input list". When an error is detected, this list is
available for editing. If input is from a terminal, the
part of the input line past the point at which the error is
detected is lost. If input is from any other device, the
current character and token positions are maintained as a
point at which further input may be found after editing.


Upon entry to the editor, an optional error message is
output along with the last several tokens on the last tokens
input list. If no specific error message is issued, the
problem is some general syntax malady, such as using a comma
for an identifier. During editing, two token lists are
maintained -- last (initially the last tokens input list)
and next. At the completion of editing, last is appended to
the front of next. The combined list is then used as the
input source for the compilation. If the command is not
completed in this list, more tokens are input from the file
that was in use when the error occurred. If a further error
occurs, the editor is re-entered. If the error occurs while
using the list, then the tokens up to and including the
error are on the last tokens input list (last) and the
remaining tokens are on next. Parts of both last and next
are output upon editor entry if they are not empty.

Several commands are available in the editor to manipulate
last and next: ML, MN, PL, PN, DL, DN, AL, and AN. The
commands T and E are also available to continue or abort the
compilation. Multiple commands may appear on one line, and
a single command may stretch over multiple lines. The
following paragraphs describe the commands.


10.1.1  ML and MN Commands

ML is followed by a positive integer. The specified number
of tokens are moved from next to last. MN is used in a like
manner to transfer tokens from last to next. Given the
following initial values of last and next:

last = SUBR FOO ALLOF )
next = A , B ) , C ;

the command ML 2 would produce

last = SUBR FOO ALLOF ) A ,
next = B ) , C ;

and the command MN 2 would produce

last = SUBR FOO
next = ALLOF ) A , B ) , C ;


10.1.2  PL  and  PN  Commands 


PL and PN are followed by a positive integer. The specified
number of tokens on last or next, respectively, are printed.
Given  the  initial  configuration 


last  = $ FLAP  DX  = VOWEL  / T 
next  = OR  D / VOWEL  ; 


the  command  PL  3 would  output 


VOWEL  / T 

and the command PN 3 would output

OR D /


10.1.3  DL and DN Commands

DL and DN are followed by a positive integer. They delete
the specified number of tokens from last and next,
respectively. Given the initial values of last and next:

last = SLEX SL1 TOTAL
next = , , COULD ;

the command DL 2 would produce

last = SLEX
next = , , COULD ;

and the command DN 2 would produce

last = SLEX SL1 TOTAL
next = COULD ;


10.1.4  AL and AN Commands

AL and AN are followed by an input sequence. They add the
input sequence to last and next, respectively. The input
sequence is delimited (on both ends) by any token that does
not appear in the sequence. Given the initial values of
last and next

last  = SLEX  S2  TOTAL  , 
next  = COULD  , ANY  ; 

the  command  AL  / PRODUCT,  / would  produce 

last  = SLEX  S2  TOTAL  , PRODUCT  , 
next  = COULD  , ANY  ; 

and  the  command  AN  / PRODUCT,  / would  produce 

last  = SLEX  S2  TOTAL  , 

next  = PRODUCT  , COULD  , ANY  ; 


10.1.5  T Command

The T command signals that editing should be terminated and
that compiling should commence. The compiler restarts with
the tokens in last and next and then returns to the input
file for any additional program text. If input is from the
user's terminal, additional text may be input on the same
line as the T command (or on following lines). If input is
from a device other than the terminal, then reading resumes,
after exhaustion of last and next, just beyond the point at
which the error was detected. For example, suppose that the
following line were input from the terminal:

SUBR FOO ONEOF) A,B),C;

The editor would respond with the error message "MISSING (".
The value of last would be SUBR FOO ONEOF ), and next would
be empty. The remainder of the input line (because it came
from a terminal) would be discarded. The remedy would be
the sequence of commands:

DL 1 T (A,B),C;

DL 1 would delete the erroneous ")", and the T command would
initiate the re-compilation. The total token sequence input
to the compiler would then be

SUBR FOO ONEOF ( A , B ) , C ;


10.1.6  E Command 

An E command exits (aborts) compilation of the current
input. This command is recognized only by the editor. If
input is from the terminal, the command supervisor will
be left in a position to accept the next command. If input
is from some other device, the rest of the input file is
skipped.


10.1.7  General  Comments  About  Editing 

ML, MN, PL, PN, DL, and DN commands receive a token count
(an integer) as an argument. If the count exceeds the
number of tokens in the list (last or next) specified by the
command, then the whole list is moved, printed, or deleted,
as is appropriate. An input to the system may be broken
with a % preceded by a space. If you are in the editor, you
will stay there. If not, you will be put into the editor
with the message, "ESCAPE".
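
The editor commands amount to operations on two token lists
that meet at the point of the error. The Python sketch
below models them; the class TokenEditor and its method
names are hypothetical, and which end of each list a command
works on is inferred from the examples in Sections 10.1.1
through 10.1.5 (the end of last and the front of next).

```python
class TokenEditor:
    """Toy model of the editor's two token lists, last and next."""

    def __init__(self, last, next_):
        self.last = list(last)
        self.next = list(next_)

    def ml(self, n):
        # Move n tokens from the front of next to the end of last.
        self.last += self.next[:n]
        self.next = self.next[n:]

    def mn(self, n):
        # Move n tokens from the end of last to the front of next.
        cut = max(0, len(self.last) - n)
        self.next = self.last[cut:] + self.next
        self.last = self.last[:cut]

    def pl(self, n):
        # The n tokens nearest the edit point on last.
        return self.last[max(0, len(self.last) - n):]

    def pn(self, n):
        # The n tokens nearest the edit point on next.
        return self.next[:n]

    def dl(self, n):
        # Delete n tokens from the end of last.
        self.last = self.last[:max(0, len(self.last) - n)]

    def dn(self, n):
        # Delete n tokens from the front of next.
        self.next = self.next[n:]

    def al(self, tokens):
        # Add tokens to the end of last.
        self.last += list(tokens)

    def an(self, tokens):
        # Add tokens to the front of next.
        self.next = list(tokens) + self.next

    def t(self):
        # T command: last followed by next is handed back to the compiler.
        return self.last + self.next


# The remedy from Section 10.1.5: DL 1 T (A,B),C;
ed = TokenEditor("SUBR FOO ONEOF )".split(), [])
ed.dl(1)
tokens = ed.t() + "( A , B ) , C ;".split()
print(" ".join(tokens))
```

A count larger than the list simply takes the whole list,
which matches the rule stated above for oversized counts.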


10.2  THE RECOMPILE COMMAND

A recompile command is used to edit and recompile the
definition of a rule-applying subr, a rule, or a
sub-lexicon. Assume that S is a subr name, R is a rule
name, and SL is a sub-lexicon name. Then three forms of the
command are:

RECOMP SUBR S;

RECOMP RULE R;

RECOMP SLEX SL;

When definitions are input to the system, the symbolic token
string is kept "two deep". That is, the latest and next to
latest definitions are maintained. The above commands work
on the latest definitions. To work on the older
definitions, use commands of the form:

RECOMP OLD SUBR S;

RECOMP OLD RULE R;

RECOMP OLD SLEX SL;

The recomp command causes the specified definition, as a
token string, to become the value of the editor's list,
next. Last is emptied, and the editor is entered. You may
then make any appropriate changes and give a T command to
recompile the definition with the modification(s). For
example, suppose the following definition is made and used:

SUBR FOO ALLOF(A,B), ONEOF(C,D);

The  command 

RECOMP  SUBR  FOO; 

is  qiven.  Then  the  followinq  edit  commands  are  issued: 

ML 2 DN 1 AL / ONEOF / T

The new definition of FOO is

SUBR FOO ONEOF(A,B), ONEOF(C,D);

and the original definition is the one usable by the word
OLD. To restore the old definition and make the current
definition the old one, use the command

RECOMP OLD SUBR FOO;

Simply give the editor the T command, and the recompilation
and swap will occur.
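
The "two deep" bookkeeping described in this section can be
modeled as a pair of saved token strings per name. The
following Python sketch is hypothetical (the class
DefinitionStore is not part of the system); it shows why
recompiling the old definition unchanged produces a swap.

```python
class DefinitionStore:
    """Toy model of two-deep definitions: name -> (current, old)."""

    def __init__(self):
        self.saved = {}

    def define(self, name, tokens):
        if name in self.saved:
            # The previous current definition becomes the old one.
            current, _ = self.saved[name]
            self.saved[name] = (tokens, current)
        else:
            # A first definition is both the current and the old one.
            self.saved[name] = (tokens, tokens)

    def recomp(self, name, old=False):
        # RECOMP [OLD] ...: the text handed to the editor as the list next.
        current, previous = self.saved[name]
        return previous if old else current

    def delete(self, name):
        # The delete command removes both saved copies.
        del self.saved[name]


store = DefinitionStore()
store.define("FOO", "SUBR FOO ALLOF(A,B), ONEOF(C,D);")
store.define("FOO", "SUBR FOO ONEOF(A,B), ONEOF(C,D);")

# RECOMP OLD SUBR FOO; followed at once by T recompiles the old text as is.
store.define("FOO", store.recomp("FOO", old=True))
print(store.recomp("FOO"))
print(store.recomp("FOO", old=True))
```

Because every successful compilation pushes the previous
current definition into the old slot, no separate swap
operation is needed.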

When definitions are output (by an output command), only the
current (or latest) definition is printed. The delete
command deletes both the current and the old definitions.
When definitions are made for which no current definitions
exist, the present input becomes both the current and the
old definition. If an uncorrected error occurs in a
definition (giving the editor the E command), no changes or
additions occur in the saved copies.


11.  SYSTEM ASPECTS


This section describes some miscellaneous features and
capabilities of the phonological rule testing system as
opposed to the command language. Even though many of the
discussed items are of interest only to one using the system
as a function library, others using the system can benefit
from a quick reading of this section.


The phonological rule testing system is embedded in SDC
LISP. The commands are implemented as an extension of the
LISP INFIX language. It is possible to intersperse rule
system commands with input to the LISP compiler and
evaluator. Therefore, some commands that contain syntax
errors may be interpreted as LISP. When this happens, the
resulting error messages and/or evaluations are based upon
the standard LISP rules.

The following subsections describe interaction with the
operating system, compiler switches, functions that process
ARPAbet, and the subr execution support routines.


11.1  INTERACTION WITH THE OPERATING SYSTEM

SDC LISP and hence the phonological rule testing system
operate under the VM/370 monitor using CMS -- see [3].
Terminal connection is made to the system via either a
telephone or the ARPA network -- see [4]. The following
describes the procedures using a telephone connection.
Section 11.1.7 describes the differences using the network.


11.1.1  Logging in and Loading

After you dial VM on a telephone line, the system outputs
the herald "VM/370 ONLINE" and enters an idle state. To
login, hit the carriage return and wait for the system to
respond. Then type a login command.


L user  pass 


User is the user's account name and pass is a password
associated with that account. (The login and all other
commands and input lines are terminated by a carriage
return.) The system responds to the login with the output of
a variety of greeting and informative messages. You are now
logged into the system and may issue commands to the
monitor.


At this point, you will wish to load and use the
rule-testing system. This is possible only if the RULELIB
191 disk is attached to your login. If it is not, then
enter the command sequence,

LINK RULELIB 191 198 RR
ACC 198 B

To load the system, input the command

TESTRULE

The system is loaded, and LISP outputs its set of greeting
messages. A state has been reached where you may now enter
rule language commands and LISP expressions.


1 '1. 1.2  Line  Editing  Characters 

The operating system provides a minimal line editing
capability. The input of certain characters affects the
composition of the line. The default line editing
characters and their meanings are: @ (delete this and the
previous character from the input line); ¢ (delete this
character and all previous characters from the input line);
# (logical end of line, used to input two or more logical
lines on the same physical line); and " (accept the next
character as is; do not interpret it as a line editing
character).

As luck would have it, each of the four default line editing
characters has a usage in the rule testing language or INFIX
LISP. Therefore, it is strongly recommended that some such
command as

TERM CH { LINED _ LINEN OFF ES OFF

be entered to the monitor (not to LISP; perhaps before you
load the rule-testing system). The result of the above TERM
command is to make "{" the character delete instead of "@",
"_" the line delete instead of "¢", and to turn the logical
line end and escape character facilities off.


11.1.3  Prompts  and  Breaks 

VM/370  handles  all  terminal  traffic  (network  or  telephone) 
in  half-duplex  mode.  Therefore,  input  should  not  be  entered 
unless  it  is  expected  by  the  monitor.  When  input  is 
expected,  a prompt  character  is  output.  The  default  prompt 
is  a bell.  If  desired,  the  prompt  may  be  changed  by  giving 
the  monitor  (not  LISP)  the  command 


TERM READ c

where c is a non-numeric character. After this command, c
will be output instead of a bell whenever input is expected.

If there is input when not expected, or the break key
(attention button on 2741 terminals) is depressed at any
time, then the monitor is entered. At this point, "!" is
output and one of four actions may be taken: (1) input a
carriage return (program execution will resume); (2) enter
HT (terminal output will cease until the next terminal input
request is issued); (3) enter RT (cancels the last HT
command and resumes terminal output); and (4) enter HX
(execution of the currently loaded program is permanently
discontinued after output of any stacked terminal lines).
If instead of one break, two are input in reasonably rapid
succession, then CP is output and you are in a position to
interact with the CP monitor component. To resume execution
of the loaded program, input "BEGIN".


11.1.4 Monitor Commands from LISP

Terminal lines input to LISP that begin with "/" have a
special interpretation. If the first two characters are
"//", then the CMS subset mode is entered. Any CP or CMS
subset commands may be input and executed. To return
control to LISP, input the command "RETURN".


If the first character of an input terminal line is "/" and
the second character is not, then the entire line except for
the first character is passed through to CP for execution.
For instance, to send a message to the terminal of the user
with account name JOHN, input


/M JOHN message text


to LISP. If you are talking directly to the monitor, the
command is entered without the "/". Note that the above usage
of "/" to communicate with the monitor can conflict with the
delimitation of the nucleus portion of a rule definition.
For the latter use, make sure "/" is not the first character
input on a terminal line; if necessary, type a blank first.
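
The routing of terminal lines described above can be summarized
in a short sketch. It is written in Python purely for
illustration and is not the system's code; the function name
classify_line and the returned labels are assumptions made for
this example.

```python
def classify_line(line):
    """Return (target, text) for one terminal line typed to LISP.

    Illustrative only: the labels "cms-subset", "cp", and "lisp"
    are names invented for this sketch.
    """
    if line.startswith("//"):
        # "//" enters the CMS subset mode; RETURN later leaves it.
        return ("cms-subset", "")
    if line.startswith("/"):
        # A single leading "/" passes the rest of the line to CP.
        return ("cp", line[1:])
    # Anything else, including a line that starts with a blank,
    # is ordinary input to LISP and the rule language.
    return ("lisp", line)


print(classify_line("/M JOHN message text"))
print(classify_line(" /X typed after a leading blank"))
```

The second call shows why typing a blank first is enough to keep
a rule's nucleus delimiter from being taken as a monitor command.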


11.1.5  LISP  Return  and  Logging  Out 

To return from LISP (and the rule testing system), enter the
command

RCNS; 

This returns you to the state that you were in before
loading the system. The monitor command to log out of
VM/370 entirely is

LOG 


Before returning to the system, it may be desired to save
the current work. This can most easily be accomplished
using the output commands (see section 8). Another method
is available that saves the entire module, currently in
operation, on disk. Assume that it is desired to save the
module as MYTESTER (see footnote 4); then input the command

SAVE("MYTESTER);

The module is written to disk and may later be reloaded by
the monitor command


MYTESTER 

Any identifier of eight or fewer characters that does not
contain any special characters may be used as the module
save name instead of MYTESTER. The save command is an
example of an ordinary LISP expression that is not part of
the rule testing language. The quote mark used in the save
command informs LISP that the name is an identifier datum
and not a variable name.

A module  save  takes  more  than  300,000  bytes  of  disk. 
Therefore,  it  should  not  be  used  promiscuously. 


11.1.6 Errors and Warnings

Besides syntax errors, other anomalies may be detected by
the compiler and run time support package. If a
rule-applying subr definition that contains references to
undefined or deleted rules is compiled, a warning diagnostic
is issued; the subr is still compiled. If a subr
containing such a reference is executed, an error message is
output and an error state is entered.

Entry  to  an  error  state  (usually  following  an  execution  time 
infraction)  is  announced  by  the  output  of  a message  that 
characterizes  the  offense  and  the  question 

PRINT  UNWRAP  Y/N/D? 


As a naive LISP user, just enter N or NO and control will
return to command input (see footnote 5).


4 GENMOD is used and will save the file as
(MYTESTER MODULE A1).


11.1.7 Network Usage

The IBM 370 Model 145 at System Development Corporation is
connected to the ARPA network as Host 8 and is known as
SDC-LAB. Connection from a Terminal Interface Processor
(TIP) necessitates usage of the two commands

@T O L
@L 8

If the connection is successfully opened, the herald "VM/370
ONLINE" is output along with a prompt. The default prompt
character for network users is a dot. A login command
should now be given. To send a break through a TIP, use the
command

@S B

The break is not acted upon until any stacked terminal lines
have been printed; therefore, be patient. As may be seen
from the above TIP commands, "@" has special significance.
In order to send "@" through, it is necessary to type "@@".
An alternate method is to define another character as "@" to
VM. For instance, the command

SET INPUT 1 7C

causes the input of "1" to be translated to the EBCDIC
character code 7C, which is the internal representation of
"@". Output is not affected.


The normal logout command "LOG" automatically closes the
network connection. If you drop the connection by any other
method, the job is put in a disconnected state and after a
respectable length of time is forced off the machine. It is
urged that whenever possible, the "LOG" command be used so
as to not tie up resources.

For network connection procedures from other than a TIP,
consult the TELNET documentation for your local host.


5 A Y or YES response outputs a stack backtrace before
returning to command state. A D response enters a special
debug supervisor that evaluates LISP S-expressions. To exit
debug state, use the word EXIT.


11.2 COMPILER FLAGS


Three flags (LISP variables) are available to control to
some extent the amount of editing text saved and the amount
of compiler output. The flag RFLG:65 determines whether
symbolic definitions of rules are saved for use by later
recompile and output commands. Initially, the flag is on.
To turn it off, enter

RFLG:65=NIL;

The flag SFLG:65 determines whether symbolic definitions of
rule-applying subrs are saved for later recompile and output
commands. Initially, the flag is on. To turn it off, enter

SFLG:65=NIL;

The flags may be turned back on by the commands

RFLG:65=T;
SFLG:65=T;

The flag TFLG:65 determines whether the compiler prints
results of the compilation of rules and rule-applying subrs.
Initially, the flag is off. To turn it on, enter

TFLG:65=T;

To turn it off again, enter

TFLG:65=NIL;

If the flag is on, then the original input and the
transformations produced by each pass of the compiler are
output.


11.3 ARPABET SPELLING PACKAGE

Internally, ARPAbet spellings are maintained as integer
arrays with each phone represented by a four-byte (32-bit)
integer. The information in each byte is

byte 0    kind, voicing, and stress level
byte 1    consonant class
byte 2    consonant place of articulation
byte 3    8-bit representation of phone's name

The redundancy in the representation is used for a speed
advantage by the rules.
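
To make the four-byte layout concrete, the following Python
sketch packs four byte values into one 32-bit integer and pulls
a field back out with a (byte index, bit-mask) pair, the kind of
pair FCODE is described as returning. The sketch assumes that
byte 0 is the most significant byte, which the text does not
state; the function names and the sample values are invented for
illustration and are not the system's actual encodings.

```python
def pack_phone(b0, b1, b2, b3):
    """Pack four byte values into one 32-bit phone word.

    Assumption: byte 0 is the most significant byte.
    """
    for b in (b0, b1, b2, b3):
        if not 0 <= b <= 0xFF:
            raise ValueError("each field must fit in one byte")
    return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3


def get_byte(word, index):
    """Return byte `index` (0..3) of a packed phone word."""
    return (word >> (8 * (3 - index))) & 0xFF


def extract_field(word, byte_index, mask):
    """Apply a (byte index, bit-mask) pair to a packed phone word."""
    return get_byte(word, byte_index) & mask


# Sample values only; real kind, class, and place codes are not given.
word = pack_phone(0x81, 0x02, 0x03, 0x2A)
print(hex(word), get_byte(word, 3), extract_field(word, 0, 0x03))
```

Because every feature sits at a fixed byte and mask, a rule can
test a feature with one shift and one AND, which is presumably
the speed advantage mentioned above.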

A spelling is built using the functions CHECKSPELL and
MAKESPELL. CHECKSPELL has one argument, a list of phone
names, ":", and integers. If the list is legal (e.g., all
list items are phone names, ":" only follows full vowel
names, integers only follow ":", and all integers are 0, 1,
or 2), then CHECKSPELL returns the list in a "normal" form
with exterior word boundaries appended. If the list is not
legal, error messages are output, and NIL is returned. The
function MAKESPELL takes as an argument a normal form value
of CHECKSPELL and converts the representation to an integer
array.
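
The legality checks listed above can be sketched as follows.
This is a Python illustration, not the CHECKSPELL source: the
function name check_spelling is hypothetical, None stands in
for NIL, no error messages are printed, and the "normal" form is
assumed to mean only that a word boundary "#" is present at each
end. The phone names are those of the command syntax in
Appendix 1.

```python
FULL_VOWELS = {"IY", "IH", "EY", "EH", "AE", "AA", "AH", "AO",
               "OW", "UH", "UW", "ER", "AW", "AY", "OY"}
CONSONANTS = {"L", "W", "Y", "R", "NX", "N", "M", "G", "B", "D",
              "P", "T", "K", "ZH", "Z", "DH", "V", "SH", "S",
              "TH", "F", "JH", "CH", "Q", "DX", "WH", "HH"}
OTHER = {"AX", "*", "#"}   # reduced vowel and the two boundaries


def check_spelling(items):
    """Return the spelling with word boundaries, or None if illegal."""
    prev = None
    for item in items:
        if isinstance(item, int):
            # Integers only follow ":" and must be 0, 1, or 2.
            if prev != ":" or item not in (0, 1, 2):
                return None
        elif item == ":":
            # ":" only follows a full vowel name.
            if prev not in FULL_VOWELS:
                return None
        elif item not in FULL_VOWELS | CONSONANTS | OTHER:
            return None
        prev = item
    if prev == ":":
        return None   # assumption: a trailing ":" is illegal
    out = list(items)
    if not out or out[0] != "#":
        out.insert(0, "#")
    if out[-1] != "#":
        out.append("#")
    return out


print(check_spelling(["T", "OW", ":", 1, "T", "AX", "L"]))
print(check_spelling(["T", ":", 1]))
```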

The spellings of forms associated with a word may be
retrieved using the macro SPELL. For example,

SPELL("TOTAL)

returns a list of the forms of the word TOTAL. Each form is
an integer array. Given an integer array spelling, the
(identifier) name of the Ith phone is retrieved by

GETNAME(s,I)

where s is the array. Assuming that the Ith phone is a
vowel, its stress level (an integer) is retrieved by

GETSTRESS(s,I)

The macros NCODE, PCODE, and FCODE are available to examine
the structure of an integer phone representation. The
argument to NCODE is a phone name; the value is the
phone's 8-bit name code. The argument to PCODE is a phone
name; the value is the 32-bit integer representation of
that phone (sans stress level). The argument to FCODE is a
feature name (STRESS0, STRESS1, and STRESS2 for stress
levels, and STRESS for the entire STRESS field); the value
is a list of two integers: (1) the byte in which
information for that feature is maintained (0, 1, 2, or 3),
and (2) a bit-mask that may be used for extracting all
relevant feature information from the selected byte.

The array C2PHARY has 256 elements, each element
corresponding to an 8-bit phone code. If p is the name of a
phone, then the expression

C2PHARY[PCODE("p)+1] EQ "p

is always true.

The function PRNSPL has one argument, an integer spelling
array. It prints the spelling in a compressed form; i.e.,
no blanks or ":" are output. It also prints (on the same
line if possible) the elements in the list RULES. The value
of RULES usually is the set of rules that have performed
transformations to produce this spelling. Another function,
PRNMAP, behaves similarly to PRNSPL. PRNMAP has no
arguments. The array printed is the value of the variable
SPELLARY and the number of phones output is equal to the
value of the variable SPELLEN.

Caution should be exercised when using the above functions
and macros. They do little error checking and can, if
misused, lead to unrecoverable errors. All are name
protected through the section mechanism. The names
CHECKSPELL, MAKESPELL, SPELL, GETNAME, GETSTRESS, NCODE,
PCODE, and FCODE are in section 1; C2PHARY is in section 65;
and PRNSPL and PRNMAP are in section 66. Thus, for example,
it is necessary to enter NCODE:1 rather than just NCODE.
See the next section for full names of the variables
SPELLARY, SPELLEN, and RULES.


11.4  EXECUTION  SUPPORT  PACKAGE 

This section briefly describes the necessary protocol to use
rules and rule-applying subrs other than through
rule-testing commands. The original intended usage of the
system was in this mode as a dynamic component of a speech
understanding system (see [2] and [5]).

The LISP sectioning mechanism has been used to minimize name
conflicts of system components and to aid program
organization. The sections used by the phonological rules
system are:

1   - general utility functions
65  - rule, subr, and command compiler internal
66  - execution support package
67  - ordered rule subrs
68  - unordered and nondeterministic rule subrs
69  - rule functions
113 - compiler command handlers

Section 1 is used by other components of the speech
understanding system, and section 113 is used by most
language extension facilities in LISP. Unless specifically
stated to the contrary, all support functions and variables
discussed below are in section 66.

11.4.1 Internal Array Handling


When rules are applied, they may match some part of the
input string (an integer array) and transform it. For
unordered and nondeterministic rule application, it is
imperative that the original string not be damaged.
Therefore, new arrays must be allocated to hold the
transformed spellings. Because this happens frequently and
can cause time-consuming garbage collections, an "erasure"
scheme has been adopted. To allocate an array, use

CREATEARY()

The value is an array into which spellings may be copied.
To return the array A to a pool for later use, use

ERASEARY(A)

The pool of available arrays is maintained on the list
ERASEL. The length of arrays allocated by CREATEARY is
equal to the value of the variable ARRAYLEN. The initial
value of ARRAYLEN is 50. To change this value to x, execute

[ERASEL=NIL, ARRAYLEN=x];

This will ensure that the pool of arrays of the old length
is discarded. Note that the size of these arrays should be
a little longer than the longest spelling you will ever
derive (see footnote 6).

Spelling arrays that hold lexicon forms are of the exact
length of the spelling (in phones, including initial and
final word boundaries). The function COPA may be used to
copy a spelling array of an exact length to an array
allocated by CREATEARY. If the value of the variable CFLG
is NIL, then a copy is not made and the argument is
returned. Otherwise, an array is created and the argument
is copied into it. The initial value of CFLG is T.

Because the system normally operates with arrays that are
longer than the actual spellings, it is necessary to
communicate the actual lengths. As rules and rule-applying
subrs are operated, they attempt to make the value of the
variable SPELLEN the actual number of phones and the value
of the variable SPELLARY the array containing the spelling.
The variable SPELLINX normally is the index of the phone
that starts the currently visible input substring, and the
variable NEXTARY is an array in which a rule may perform the
reconstruction dictated by its left side.


6 Be generous; the penalty for exceeding this bound may be
an unrecoverable program check.
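
The erasure scheme is, in modern terms, a free-list pool of
fixed-length arrays. The Python sketch below illustrates the
idea under stated assumptions: the class ArrayPool and its
method names are invented here, Python lists stand in for
integer arrays, and the bounds check in copy_in is an addition
that the original functions are not said to perform.

```python
class ArrayPool:
    """Free-list pool of fixed-length arrays (illustrative only)."""

    def __init__(self, array_len=50):
        self.array_len = array_len   # plays the role of ARRAYLEN
        self.free = []               # plays the role of ERASEL

    def create(self):
        """Like CREATEARY: reuse a pooled array or allocate one."""
        if self.free:
            return self.free.pop()
        return [0] * self.array_len

    def erase(self, array):
        """Like ERASEARY: return an array to the pool."""
        self.free.append(array)

    def resize(self, new_len):
        """Discard the old pool before changing the array length."""
        self.free = []
        self.array_len = new_len

    def copy_in(self, spelling, copy_flag=True):
        """Like COPA: copy an exact-length spelling into a pool array.

        With copy_flag false (CFLG is NIL) the argument is returned.
        """
        if not copy_flag:
            return spelling
        if len(spelling) > self.array_len:
            raise ValueError("spelling longer than pool arrays")
        array = self.create()
        array[:len(spelling)] = spelling
        return array


pool = ArrayPool()
work = pool.copy_in([7, 8, 9])
print(len(work), work[:3])
pool.erase(work)
```

Clearing the free list in resize mirrors the advice above to set
ERASEL to NIL whenever ARRAYLEN changes, so that arrays of the
old length are never handed out again.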
11.4.2 Rule Calling Sequence

A rule is compiled as a function with the rule name in
section 69 as the function name. When a rule is called, it
is expected that the phonetic input string be in the array
that is the value of the variable SPELLARY and that the
value of the variable SPELLINX be the index of the phone
that starts the currently visible substring. If the rule
does not match the substring, then NIL is returned as the
value. If the rule matches, then several conditions prevail
when the rule function returns: (1) the value of the
function (in register AC) is the length (a small integer) of
the reconstruction sequence generated by the rule's left
side, (2) the identifier name of the rule is in register
AC0, (3) the system entries BMRK and EMRK are set to the
absolute locations of the beginning and just beyond the end
of the part of SPELLARY matched by the rule's nucleus, and
(4) the system entry CHANGES contains the reconstruction
sequence generated by the rule's left side. If the rule
function returns non-NIL, then the derived spelling is
generated by calling the proper reconstruction function.
The reconstructor used depends upon whether the rule
application is ordered, unordered, or nondeterministic. For
ordered rule application, use a code sequence like

(ARGS) (CALL rule) (BZH AC (LABEL L)) (ARGS) (CALL RECO) L

for unordered rule application,

(ARGS) (CALL rule) (BZH AC (LABEL L)) (ARGS) (CALL RECU) L

and for nondeterministic rule application,

(ARGS) (CALL rule) (BZH AC (LABEL L)) (ARGS) (CALL RECN) L

where rule is the rule name in section 69, for instance
(FLAP . 69). The reconstruction functions RECO, RECU, and
RECN each behave a little differently to be compatible with
the different kinds of rule-applying subrs. Each is
described below.


11.4.3  Ordered  Subrs 

An ordered rule-applying subr is compiled as a two-argument
function. The name of the function is the subr name in
section 67. The arguments are the input spelling and the
length of the spelling. The input spelling is copied by
COPA (if the value of CFLG is non-NIL). The value of the
variable SPELLARY is set to the input spelling, and the
value of the variable SPELLEN is set to the spelling's
length. Then the subr parts are executed, left to right
across the input, one at a time. If a rule in a subr part
matches the input substring, then RECO is called
immediately. RECO performs several functions: (1) sets the
value of the variable RULENAME to the name of the rule
matched, (2) makes the changes in SPELLARY, (3) sets SPELLEN
to the new spelling length, (4) adds the name of the rule
(value of RULENAME) to the list RULES, and (5) calls the
value of the functional variable MAPPER. In rule testing
mode, the value of MAPPER is the function PRNMAP. The
variable SPELLINX is the index of the substring matched by
the rule. Its value is available when the value of MAPPER
is called.

Several points should be noted when any program other than
the rule testing system is directly calling an ordered
rule-applying subr: (1) it is the caller's duty to bind or
set CFLG to the proper value, (2) RULES should be re-bound
so that it will reflect only the rules that have operated on
this spelling, and (3) if desired and appropriate, call
ERASEARY with the value of SPELLARY at the completion of
execution.
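
The control flow of ordered application can be sketched in
Python as follows. This is an illustration under assumptions,
not the compiled code: a rule is modeled as a function that
returns None on no match or (end, replacement) on a match,
phones are plain strings rather than packed integers, each subr
part is a single rule, and the FLAP rule shown is hypothetical.

```python
VOWELS = {"IY", "IH", "EY", "EH", "AE", "AA", "AH", "AO", "OW",
          "UH", "UW", "AX", "ER", "AW", "AY", "OY"}


def flap(spelling, i):
    """Hypothetical rule: T between two vowels becomes DX."""
    if (0 < i < len(spelling) - 1 and spelling[i] == "T"
            and spelling[i - 1] in VOWELS
            and spelling[i + 1] in VOWELS):
        return i + 1, ["DX"]
    return None


def apply_ordered(rules, spelling, mapper=None):
    """Apply each rule in turn, scanning left to right in place."""
    spelling = list(spelling)      # the COPA-style working copy
    applied = []                   # plays the role of RULES
    for name, rule in rules:
        i = 0
        while i < len(spelling):
            match = rule(spelling, i)
            if match is None:
                i += 1
                continue
            end, replacement = match
            spelling[i:end] = replacement     # RECO: change the spelling
            applied.append(name)              # RECO: record the rule name
            if mapper is not None:
                mapper(spelling, applied, i)  # RECO: call MAPPER
            # Move past the replacement; always make progress.
            i += len(replacement) if (replacement or end > i) else 1
    return spelling, applied


print(apply_ordered([("FLAP", flap)], ["#", "B", "AE", "T", "ER", "#"]))
```

A caller outside the rule tester would pass its own mapper, much
as MAPPER is bound to PRNMAP in rule testing mode.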


11.4.4 Unordered Subrs

An unordered rule-applying subr is compiled as a
three-argument function. The function name is the subr name
in section 68. The arguments are the spelling array (input
string), the number one, and the spelling's length. (The
second argument is actually used as the substring start
location. However, unordered subrs are called recursively
and must be initially "primed" with a one.) Unordered subrs
re-bind numerous global variables: SPELLARY, SPELLINX,
SPELLEN, NEXTARY, RULES, SPELLFN, and RULENAME. When the
subr is entered, it immediately calls the value of the
functional variable MAPPER (set to PRNMAP in rule testing
mode). The values of SPELLARY, SPELLINX, SPELLEN, and RULES
are proper and reflect facts about the spelling in SPELLARY.
The functional variable SPELLFN is bound to the subr itself
so that RECU can make proper recursive calls. The value of
NEXTARY is initialized to an array (by CREATEARY) on each
subr entry, and is released (by ERASEARY) on subr exit. The
array is used by RECU.

RECU is the reconstruction function used by unordered
rule-applying subrs. The name of the rule that just matched
is stuffed into the variable RULENAME. Then the new
spelling is constructed in NEXTARY without modifying the
value of SPELLARY. Next, the value of SPELLFN (the subr) is
called recursively with the three arguments NEXTARY,
SPELLINX, and the length of the new spelling that is in
NEXTARY.

When an unordered subr is called, it is not necessary to
copy a spelling of exact length into a longer array; the
original spelling array is not altered. Before the subr is
called, the value of RULENAME should be bound to some
meaningful value, say the word whose spelling is being
operated upon. The reason is that the value of RULENAME is
added to RULES by the initial subr call as if it were a rule
name.
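
The recursive scheme used by unordered subrs and RECU can be
sketched as below. The Python is illustrative only: rules are
modeled as functions returning None or (end, replacement),
indices are zero-based where the original is primed with a one,
fresh lists take the place of NEXTARY, and both sample rules are
hypothetical. The sketch does not guard against a rule whose
output matches itself again, which would recurse without end.

```python
VOWELS = {"IY", "IH", "EY", "EH", "AE", "AA", "AH", "AO", "OW",
          "UH", "UW", "AX", "ER", "AW", "AY", "OY"}


def flap(spelling, i):
    """Hypothetical rule: T between two vowels becomes DX."""
    if (0 < i < len(spelling) - 1 and spelling[i] == "T"
            and spelling[i - 1] in VOWELS
            and spelling[i + 1] in VOWELS):
        return i + 1, ["DX"]
    return None


def split_er(spelling, i):
    """Hypothetical rule: ER becomes AX R."""
    if spelling[i] == "ER":
        return i + 1, ["AX", "R"]
    return None


RULE_SET = [("FLAP", flap), ("SPLIT-ER", split_er)]


def apply_unordered(rules, spelling, start=0, applied=(), mapper=print):
    """Report this spelling, then recurse on every derived spelling."""
    mapper(spelling, applied)          # MAPPER is called on subr entry
    for i in range(start, len(spelling)):
        for name, rule in rules:
            match = rule(spelling, i)
            if match is None:
                continue
            end, replacement = match
            # Build the new spelling without altering the original.
            derived = spelling[:i] + list(replacement) + spelling[end:]
            # Recurse from the position of the match, as RECU does.
            apply_unordered(rules, derived, i, applied + (name,), mapper)


apply_unordered(RULE_SET, ["#", "B", "AE", "T", "ER", "#"])
```

Because each derived spelling is a new list, the caller's array
is never damaged, which is the property the text requires of
unordered application.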


11.4.5  Nondeterministic  Subrs 

A nondeterministic rule-applying subr is compiled as a
three-argument function. The function name is the subr name
in section 68. The first argument is the input string (as
an integer array), and the third argument is the length of
the input string. The second argument is irrelevant. (This
makes calling sequences reasonably compatible with unordered
rule-applying subr functions.) The subr binds several
special variables: SPELLARY, SPELLINX, SPELLEN, NEXTARY,
TRYSET, and DONESET. The value of SPELLINX cannot be relied
upon in nondeterministic subrs. The value of NEXTARY is
initialized to an array by CREATEARY. All arrays created in
a nondeterministic subr are erased (by ERASEARY) before
exit. Each time a rule matches the input substring, RECN is
called. The functions performed by RECN are: (1) stuff the
value of RULENAME with the name of the rule that matched,
(2) put the derived spelling in NEXTARY, (3) add to TRYSET
(see footnote 7) a list of the length of the derived
spelling, NEXTARY, and the name of the rules that have
participated in deriving this spelling, and (4) set the
value of NEXTARY to another array (using CREATEARY).

Eventually, each array on TRYSET (including the original
input array) becomes the value of SPELLARY, the length of
the array is set in SPELLEN, and RULES is set to the list of
rules that have participated in deriving this spelling. The
value of the functional variable MAPPER is then called. In
rule testing mode, the value of MAPPER is the function
PRNMAP. As with unordered subrs, the value of RULENAME
should be bound to some meaningful value before the subr is
called. The initial value of RULENAME is added to RULES as
if it were a rule name.
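
The TRYSET bookkeeping amounts to a worklist search that skips
spellings already generated. The Python sketch below shows that
idea under assumptions: rules are functions returning None or
(end, replacement), spellings are tuples of phone names, a set
is used for the duplicate test, the order in which TRYSET
entries are taken is a guess, and both sample rules are
hypothetical. It terminates only if the rules can derive a
finite number of distinct spellings.

```python
VOWELS = {"IY", "IH", "EY", "EH", "AE", "AA", "AH", "AO", "OW",
          "UH", "UW", "AX", "ER", "AW", "AY", "OY"}


def flap(spelling, i):
    """Hypothetical rule: T between two vowels becomes DX."""
    if (0 < i < len(spelling) - 1 and spelling[i] == "T"
            and spelling[i - 1] in VOWELS
            and spelling[i + 1] in VOWELS):
        return i + 1, ["DX"]
    return None


def split_er(spelling, i):
    """Hypothetical rule: ER becomes AX R."""
    if spelling[i] == "ER":
        return i + 1, ["AX", "R"]
    return None


RULE_SET = [("FLAP", flap), ("SPLIT-ER", split_er)]


def apply_nondeterministic(rules, spelling, origin, mapper):
    """Visit every distinct spelling derivable from the input."""
    first = tuple(spelling)
    try_set = [(first, (origin,))]   # pending spellings (TRYSET)
    generated = {first}              # spellings already produced
    done_set = []                    # spellings already visited (DONESET)
    while try_set:
        current, applied = try_set.pop(0)
        mapper(current, applied)     # MAPPER sees each spelling once
        done_set.append((current, applied))
        for i in range(len(current)):
            for name, rule in rules:
                match = rule(current, i)
                if match is None:
                    continue
                end, replacement = match
                derived = current[:i] + tuple(replacement) + current[end:]
                if derived in generated:
                    continue         # same spelling already generated
                generated.add(derived)
                try_set.append((derived, applied + (name,)))
    return done_set


for spelling, applied in apply_nondeterministic(
        RULE_SET, ["#", "B", "AE", "T", "ER", "#"], "BATTER",
        mapper=lambda s, a: None):
    print(" ".join(spelling), applied)
```

The origin argument plays the part of the initial RULENAME
value, which heads each list of participating rules as if it
were a rule name.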


7 Step three is bypassed if the same spelling has already
been generated.


APPENDIX 1: COMMAND SYNTAX


The following summarizes the syntax of the commands in the
rule system. The description is standard BNF with the usual
augmentation; namely, the use of % means that the following
term must occur at least once and may occur multiple times.
A form like %'x' means that the following term must occur at
least once or may occur multiple times separated by x's.
For example, %'-'A means A, A-A, A-A-A, etc. Square
brackets, [ and ], mean that the occurrence of the enclosed
term is optional. { and } are used as meta parentheses.
The occurrence of | between terms means alternation; i.e., a
choice of the terms.


The  syntax  of  a rule  definition  is 

<rule>::=$ <r-name> <left-side>=<right-side>[<conditional>];

<r-name>::=<identifier>

<left-side>::=NIL|%','<left-part>

<left-part>::=<consonant-name>|<boundary-name>|
              <reduced-name>|<left-vowel>|
              <index>|<constructed-consonant>

<left-vowel>::=<vowel-designator>[<stress-designator>]

<vowel-designator>::=<full-vowel-name>|<index>

<index>::=<integer>

<stress-designator>::=<explicit-stress>|<borrow>

<explicit-stress>::=:<stress>

<stress>::=0|1|2

<borrow>::=@<index>

<constructed-consonant>::=(<class-designator>
                           <place-designator>
                           <voice-designator>)

<class-designator>::=<class-name>|CLASS<borrow>

<place-designator>::=<place-name>|PLACE<borrow>

<voice-designator>::=[-]VOICE[<borrow>]

<right-side>::=<nucleus>|
               [<left-context>]/[<nucleus>]/
               [<right-context>]

<nucleus>::=%','<right-part>

<right-context>::=%','<right-part>

<left-context>::=%','<right-part>

<right-part>::=<repeat>|<optional>|<choice>

<repeat>::=REP <min-count> <choice>

<min-count>::=<integer>

<optional>::=OPT <choice>

<choice>::=%'OR'<pat-part>

<pat-part>::=<consonant-name>|<boundary-name>|<reduced-name>|
             <full-vowel-name>[<explicit-stress>]|
             <class-name>|<place-name>|<kind-name>|
             VOICE|VOWEL<explicit-stress>|<feature-bundle>|
             +<pat-part>|-<pat-part>

<feature-bundle>::=(%<choice>)

<conditional>::=IF <cond-body>

<cond-body>::=%'OR'<cond-and>

<cond-and>::=%'AND'<relation>

<relation>::=<kind-test>|<class-test>|<place-test>|
             <stress-test>|<name-test>|<voice-test>|
             (<cond-body>)

<kind-test>::=KIND<borrow> {EQ|NQ}
              {KIND<borrow>|<kind-name>}

<class-test>::=CLASS<borrow> {EQ|NQ}
               {CLASS<borrow>|<class-name>}

<place-test>::=PLACE<borrow> {EQ|NQ}
               {PLACE<borrow>|<place-name>}

<stress-test>::=STRESS<borrow> {EQ|NQ|GQ|LQ|GR|LS}
                {STRESS<borrow>|<stress>}

<name-test>::=NAME<borrow> {EQ|NQ}
              {NAME<borrow>|<boundary-name>|<consonant-name>|
               <reduced-name>|<full-vowel-name>}

<voice-test>::=[-]VOICE<borrow>|
               VOICE<borrow> {EQ|NQ} VOICE<borrow>

<consonant-name>::=L|W|Y|R|NX|N|M|G|B|D|P|T|K|ZH|Z|DH|V|
                   SH|S|TH|F|JH|CH|Q|DX|WH|HH

<boundary-name>::=*|#

<reduced-name>::=AX

<full-vowel-name>::=IY|IH|EY|EH|AE|AA|AH|AO|OW|UH|UW|
                    ER|AW|AY|OY

<class-name>::=AFRIC|FRIC|PLOS|NASAL|GLIDE|
               LATERAL|CENTRAL

<place-name>::=LABIAL|ALVEOLAR|ALVPAL|DENTAL|
               VELAR|PALATAL

<kind-name>::=BOUNDARY|CONST|VOWEL


The  syntax  of  a subr  definition  is 

<subr>::={<unordered-subr>|<nondeterministic-subr>|
          <ordered-subr>};

<unordered-subr>::=SUBR UNORDERED <s-name>
                   %','<r-name>

<nondeterministic-subr>::=SUBR NODETERM <s-name>
                          %','<r-name>

<ordered-subr>::=SUBR [ORDERED] <s-name> %','<subr-part>

<s-name>::=<identifier>

<subr-part>::=<r-name>|<allof>|<oneof>|<if>|<unless>|
              (<subr-part>)

<allof>::=ALLOF(%','<subr-part>)

<oneof>::=ONEOF(%','<subr-part>)

<if>::=IF <subr-part> THEN <subr-part> [ELSE <subr-part>]

<unless>::=UNLESS <subr-part> THEN <subr-part>
           [ELSE <subr-part>]


The  syntax  of  a recompile  command  is 

<recomp>::=RECOMP [OLD] {SUBR <s-name>|
                         RULE <r-name>|
                         SLEX <l-name>};


The  syntax  of  a delete  command  is 

<delete>::=DEL {<word>|
                SUBR <s-name>|
                RULE <r-name>|
                SLEX <l-name>};


The  syntax  of  a sub  lexicon  definition  command  is 

<sub-lexicon>::=SLEX <l-name> %','{<word>[<form-index>]};

<word>::=<identifier>

<form-index>::=:<integer>


The  syntax  of  a lexicon  entry  definition  command  is 

<lexicon>::=LEX {<lex-print>|<lex-aug>|<lex-def>};

<lex-print>::=<word>[<form-index>]

<lex-aug>::=<word>*%','<arpa-spell>

<lex-def>::=<word><form-index><arpa-spell>|
            <word><arpa-spell>

<arpa-spell>::=(%{<consonant-name>|<boundary-name>|
                  <reduced-name>|
                  <full-vowel-name>[<explicit-stress>]})


The syntax of the execute commands is

<execute>::={<run>|<joe>|<mary>};

<run>::=RUN <s-name> <run-object>

<joe>::=JOE <run-object>

<mary>::=MARY <run-object>

<run-object>::=LEX|<l-name>|<arpa-spell>|
               <word>[<form-index>]


The  syntax  of  the  output  commands  is 

<output>::={<terminal>|<printer>|<disk>};

<terminal>::=TERMINAL <out-sequence>

<printer>::=PRINTER <out-sequence>

<disk>::=DISK {<f-name>|<fd>} <out-sequence>

<f-name>::=<identifier>

<out-sequence>::=%','<out-spec>

<out-spec>::=ALL|SUBRS|RULES|LEX|SLEXS|SUBR <s-name>|
             RULE <r-name>|SLEX <l-name>|
             RUN {<s-name>|JOE|MARY} {<l-name>|LEX}


The  syntax  of  the  query  command  is 

<query>::=?{RULES|SUBRS|LEX|SLEXS};


APPENDIX 2: PHONOLOGICAL SYMBOLS AND THEIR FEATURES

PHONE   EXAMPLE        FEATURES

IY      beat           VOWEL, stress, VOICE
IH      bit            VOWEL, stress, VOICE
EY      bait           VOWEL, stress, VOICE
EH      bet            VOWEL, stress, VOICE
AE      bat            VOWEL, stress, VOICE
AA      bob            VOWEL, stress, VOICE
AH      but            VOWEL, stress, VOICE
AO      bought         VOWEL, stress, VOICE
OW      boat           VOWEL, stress, VOICE
UH      book           VOWEL, stress, VOICE
UW      boot           VOWEL, stress, VOICE
AX      about          VOWEL, :0, VOICE
ER      bird           VOWEL, stress, VOICE
AW      down           VOWEL, stress, VOICE
AY      buy            VOWEL, stress, VOICE
OY      boy            VOWEL, stress, VOICE
Y       you            CONST, GLIDE, PALATAL, VOICE
W       wit            CONST, GLIDE, LABIAL, VOICE
R       rent           CONST, CENTRAL, ALVEOLAR, VOICE
L       let            CONST, LATERAL, ALVEOLAR, VOICE
M       met            CONST, NASAL, LABIAL, VOICE
N       net            CONST, NASAL, ALVEOLAR, VOICE
NX      sing           CONST, NASAL, VELAR, VOICE
P       pet            CONST, PLOS, LABIAL
T       ten            CONST, PLOS, ALVEOLAR
K       kit            CONST, PLOS, VELAR
B       bet            CONST, PLOS, LABIAL, VOICE
D       debt           CONST, PLOS, ALVEOLAR, VOICE
G       get            CONST, PLOS, VELAR, VOICE
HH      hat            CONST, MISC
F       fat            CONST, FRIC, LABIAL
TH      thing          CONST, FRIC, DENTAL
S       sat            CONST, FRIC, ALVEOLAR
SH      shut           CONST, FRIC, ALVPAL
V       vat            CONST, FRIC, LABIAL, VOICE
DH      that           CONST, FRIC, DENTAL, VOICE
Z       zoo            CONST, FRIC, ALVEOLAR, VOICE
ZH      azure          CONST, FRIC, ALVPAL, VOICE
CH      church         CONST, AFRIC, ALVPAL
JH      judge          CONST, AFRIC, ALVPAL, VOICE
WH      which          CONST, MISC, LABIAL
DX      batter         CONST, MISC, ALVEOLAR, VOICE
Q       glottal stop   CONST, MISC, VOICE
*       syllable       BOUNDARY
#       word           BOUNDARY

In the FEATURES column, stress stands for :0, :1, or :2.